arXiv:2606.15733v1

Abstract: Instruction-tuned language models can answer the same causal-reasoning question differently after its English variable names are replaced by type-preserving placeholders, although the structural causal model and the gold answer are unchanged. We ask whether this lexical gap reflects information loss in the placeholder view or a misaligned read-out from a representation that still carries answer-relevant content. Vernier uses a paired-view weight update as an instrument and then inspects the mechanism left after the gap closes. In the working regi

Source: arXiv cs.CL — read the full report at the original publisher.

This is a curated wire item. The Continuum Brief does not republish full third-party articles; this entry links to the original source.