Quantifying the Affective Gap: A Zero-Shot Evaluation of LLMs on Fine-Grained Emotion Taxonomies

arXiv:2607.00968v1. Abstract: Emotion recognition in natural language is a foundational challenge in affective computing, with critical implications for human-computer interaction, mental health support, and conversational AI. This paper presents a rigorous, unified zero-shot evaluation of three leading commercial large language models: Claude (claude-sonnet-4-6), ChatGPT (GPT-5.4), and Gemini (gemini-2.5-flash). The models were queried through their respective production APIs as of April 2026 on a fine-grained 13-class emotion classification task. Using a stratified 1,000-se
The rapid advancement and widespread deployment of large language models necessitate continuous and more granular evaluation to understand their capabilities and limitations in complex tasks like zero-shot emotion recognition.
Precise understanding of LLM emotional intelligence is critical for developing more sophisticated human-AI interaction, mental health applications, and ethical AI systems, impacting their utility across numerous sectors.
We now have a rigorous benchmark for comparing the zero-shot fine-grained emotion recognition capabilities of leading commercial LLMs, providing clearer insights into their current performance and areas for improvement.
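To make the evaluation setup concrete, here is a minimal sketch of how a stratified sample might be drawn and how zero-shot predictions might be scored. It is an illustration only and not the paper's pipeline: the function names, the choice of accuracy and macro-F1 as metrics, and the toy emotion labels are all assumptions, since the truncated abstract names neither the metrics nor the 13 classes.

```python
import random
from collections import defaultdict


def stratified_sample(items, n, seed=0):
    """Draw roughly n (text, label) pairs while preserving label proportions.

    Hypothetical helper: the source only says the sample was stratified.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in items:
        by_label[label].append((text, label))
    total = len(items)
    sample = []
    for label in sorted(by_label):
        group = by_label[label]
        # Allocate slots in proportion to class frequency, at least one per class.
        k = max(1, round(n * len(group) / total))
        sample.extend(rng.sample(group, min(k, len(group))))
    return sample


def accuracy(gold, pred):
    """Fraction of predictions that exactly match the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred, labels):
    """Unweighted mean of per-class F1, so rare emotions count as much as common ones."""
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for illustration only; the paper's 13 classes are not listed in the source.
gold = ["joy", "joy", "anger", "fear"]
pred = ["joy", "anger", "anger", "fear"]
print(accuracy(gold, pred))
print(macro_f1(gold, pred, ["joy", "anger", "fear"]))
```

Macro-averaging is a common choice for fine-grained taxonomies because it keeps a model from scoring well by getting only the frequent classes right. Whether the paper reports it is not stated in the excerpt above.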
- AI developers
- Affective computing researchers
- Mental health AI applications
- Companies with proprietary fine-tuning data
- LLMs with poor emotion recognition
- Simple rule-based emotion detection systems
- Companies relying on outdated emotional AI models
This evaluation will likely spur further research and development into improving LLMs' understanding and generation of human emotion.
Improved emotional intelligence in LLMs could lead to more empathetic and nuanced conversational AI agents, enhancing user experience and trust.
The ability of LLMs to accurately interpret fine-grained emotions could enable automated psychological screening tools and therapeutic aids, raising new ethical and privacy concerns.
Read at arXiv cs.CL