
arXiv:2606.13189v1. Abstract: Prompt-based LLMs are increasingly used for stance detection, but harder examples are not always repaired by clearer instructions, reasoning prompts, retrieval, or debate. We introduce SICI (Stance Inference Complexity Index), a seven-dimensional diagnostic measure of the semantic-pragmatic burden imposed by a target--text pair. Across SemEval-2016 and VAST, SICI predicts LLM accuracy better than surface proxies and shows substantial cross-scorer reliability ($\alpha=0.771$). More importantly, LLM errors change regime as SICI increases: low-compl
As LLMs are increasingly used for nuanced tasks such as stance detection, better diagnostic tools are needed to characterize their limitations and failure modes.
Predictable failure points matter more as LLMs are deployed in high-stakes information environments.
SICI offers a quantitative measure that predicts LLM accuracy better than surface proxies and reveals distinct error regimes, going beyond surface-level evaluation.
- AI researchers
- LLM developers
- Organizations deploying LLMs for social analytics
- Developers relying solely on simple prompt engineering
SICI's predictive power could lead to more targeted improvements in LLM architecture and prompting strategies for complex tasks.
Improved understanding of LLM error regimes could enable dynamic prompting or model selection based on predicted task complexity.
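As a rough illustration of complexity-based routing, the sketch below maps a precomputed complexity score to a handling strategy. Everything in it is an assumption made for illustration: the source does not say how SICI is scaled, so the score is taken to be normalized to [0, 1], and the band boundaries, the `Route` type, the `select_strategy` function, and the strategy labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    max_score: float  # inclusive upper bound of this complexity band
    strategy: str     # hypothetical label for how to handle the example


# Hypothetical bands; real cut-offs would have to be fit on labeled data.
ROUTES = (
    Route(0.33, "zero_shot_prompt"),
    Route(0.66, "reasoning_prompt"),
    Route(1.00, "stronger_model_or_human_review"),
)


def select_strategy(score: float, routes: tuple = ROUTES) -> str:
    """Return the strategy of the first band whose upper bound covers the score.

    Assumes `score` is a complexity estimate normalized to [0, 1]
    and that `routes` is sorted by ascending `max_score`.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    for route in routes:
        if score <= route.max_score:
            return route.strategy
    # Reached only if the last band's bound is below 1.0.
    return routes[-1].strategy


if __name__ == "__main__":
    for s in (0.10, 0.50, 0.90):
        print(s, select_strategy(s))
```

The highest band escalates rather than adding more prompting, since the abstract notes that harder examples are not always repaired by clearer instructions or reasoning prompts.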
More reliable LLM-based stance detection could influence information warfare and public opinion analysis, with both positive and negative societal implications.
Read at arXiv cs.CL