Light or Full Verb? A Minimal-Pair Dataset for Probing Phraseological Competence in Language Models

arXiv:2606.05087v1. Abstract: Frequent English verbs such as 'have' and 'make' can function either as collocates in light-verb constructions or as full lexical predicates, as in 'make a decision' vs. 'make a cake'. Whether language models represent this distinction remains unclear. We introduce a large-scale controlled dataset of minimally varying English sentence series in which the same context contains the same verb in light-verb and full-verb uses. Two probing experiments show that language models differentiate between these uses even in minimal contexts and exhibit separ
As language models become more capable and more widely deployed, it matters more to know what linguistic competence they actually have.
This research examines one specific capability, the distinction between light-verb and full-verb uses of the same verb, which is relevant to building AI agents and applications that handle language robustly.
We now have clearer empirical evidence that language models can differentiate between subtle linguistic constructions like light-verb vs. full-verb uses in English.
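To make the idea of a minimal pair concrete, here is a small Python sketch that fills one shared context with a light-verb object and a full-verb object. It is an illustration only, not the authors' dataset construction: the `MinimalPair` class, the `build_pairs` and `differing_tokens` helpers, the sentence template, and the 'have a look' / 'have a car' pair are all assumptions introduced here. Only 'make a decision' vs. 'make a cake' comes from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinimalPair:
    """One context, one verb, two uses (hypothetical structure)."""
    verb: str
    light: str  # sentence with the light-verb use
    full: str   # sentence with the full-verb use


def build_pairs(template, verb, light_objects, full_objects):
    """Fill the same template with paired light-verb and full-verb objects."""
    return [
        MinimalPair(
            verb=verb,
            light=template.format(verb=verb, obj=light_obj),
            full=template.format(verb=verb, obj=full_obj),
        )
        for light_obj, full_obj in zip(light_objects, full_objects)
    ]


def differing_tokens(pair):
    """Return the (light, full) token pairs that differ position by position.

    Assumes both sentences have the same number of whitespace-separated
    tokens, which holds when the two objects have the same length.
    """
    light_tokens = pair.light.split()
    full_tokens = pair.full.split()
    return [(a, b) for a, b in zip(light_tokens, full_tokens) if a != b]


# Illustrative template; the paper's actual contexts are not shown in the abstract.
TEMPLATE = "They {verb} {obj} every week."

pairs = (
    build_pairs(TEMPLATE, "make", ["a decision"], ["a cake"])
    + build_pairs(TEMPLATE, "have", ["a look"], ["a car"])
)

for pair in pairs:
    print(pair.verb, "|", pair.light, "|", pair.full, "|", differing_tokens(pair))
```

Because each pair differs in a single token, any difference in how a model scores or represents the two sentences can be attributed to the light-verb vs. full-verb contrast rather than to the surrounding context.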
- AI researchers
- NLP developers
- AI ethics and safety
- Oversimplified views of LLM capabilities
Improved understanding of language model internal representations.
Development of more linguistically sophisticated and less error-prone AI agents.
Enhanced trust and broader adoption of AI systems in complex linguistic tasks.
Read at arXiv cs.CL