SIGNAL · AI · Jun 4, 2026, 4:00 AM · Signal 55 · Short term

Light or Full Verb? A Minimal-Pair Dataset for Probing Phraseological Competence in Language Models

arXiv:2606.05087v1 (announce type: new). Abstract: Frequent English verbs such as 'have' and 'make' can function either as collocates in light-verb constructions or as full lexical predicates, as in 'make a decision' vs. 'make a cake'. Whether language models represent this distinction remains unclear. We introduce a large-scale controlled dataset of minimally varying English sentence series in which the same context contains the same verb in light-verb and full-verb uses. Two probing experiments show that language models differentiate between these uses even in minimal contexts and exhibit separ

Why this matters

Why now

The proliferation of advanced language models necessitates a deeper understanding of their linguistic competence.

Why it’s important

This research clarifies how language models handle a fine-grained phraseological distinction, which matters for building reliable AI agents and robust AI applications.

What changes

We now have clearer empirical evidence that language models can differentiate between subtle linguistic constructions like light-verb vs. full-verb uses in English.
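To make the minimal-pair idea concrete, here is a small sketch of how one such pair could be represented: one shared sentence frame, one verb, and only the object noun swapped between the light-verb and full-verb reading. The `MinimalPair` class, the `build_pair` and `differing_tokens` helpers, and the sentence template are all hypothetical illustrations, not the format of the dataset described in the abstract; only the 'make a decision' vs. 'make a cake' contrast comes from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinimalPair:
    """Hypothetical container for one light-verb / full-verb sentence pair."""
    verb: str
    light: str  # sentence where the verb is used as a light verb
    full: str   # sentence where the verb is used as a full lexical verb


def build_pair(template: str, verb: str, light_noun: str, full_noun: str) -> MinimalPair:
    """Fill the same template twice, changing only the object noun."""
    return MinimalPair(
        verb=verb,
        light=template.format(verb=verb, noun=light_noun),
        full=template.format(verb=verb, noun=full_noun),
    )


def differing_tokens(pair: MinimalPair) -> list[tuple[int, str, str]]:
    """Return (position, light_token, full_token) wherever the sentences differ."""
    light_tokens = pair.light.split()
    full_tokens = pair.full.split()
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(light_tokens, full_tokens))
        if a != b
    ]


# Assumed template; the contrast pair is the one quoted in the abstract.
pair = build_pair("They will {verb} a {noun} tomorrow.", "make", "decision", "cake")
print(pair.light)
print(pair.full)
print(differing_tokens(pair))
```

Checking that the two sentences differ in exactly one token is a simple way to confirm a pair is minimal, so that any difference a probe picks up can be attributed to the verb's use and not to the surrounding context.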

Winners

· AI researchers
· NLP developers
· AI ethics and safety

Losers

· Oversimplified views of LLM capabilities

Second-order effects

Direct

Improved understanding of language model internal representations.

Second

Development of more linguistically sophisticated and less error-prone AI agents.

Third

Enhanced trust and broader adoption of AI systems in complex linguistic tasks.

Editorial confidence: 90 / 100 · Structural impact: 40 / 100

Original report

This signal links to a primary source. Continuum Brief monitors and indexes it as part of the live intelligence stream — we do not republish source content.

Read at arXiv cs.CL

#cs.CL
