SIGNAL · AI · Jul 1, 2026, 4:00 AM · Signal 75 · Short term

Measuring Judgment Quality in Natural-Language Explanations: Evidence from Forecasting Tournaments

arXiv:2606.30987v1. Abstract: Decision-makers routinely rely on expert judgments accompanied by written explanations, yet explanation quality is difficult to measure at scale. Forecasting tournaments offer a natural testing ground: probabilistic judgments are paired with natural-language rationales and scored against realized outcomes. We introduce Explanation Quality Markers (EQMs), a set of sixty theory-guided reasoning patterns scored by large language models (LLMs). In a pre-registered analysis of over 55,000 forecast-rationale pairs from a multiyear forecasting tournamen
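
To make "scored against realized outcomes" concrete, here is a minimal sketch in Python. The abstract excerpt does not name the scoring rule, so the Brier score (squared error between a forecast probability and a binary outcome, lower is better) is an assumption here, chosen because it is a common rule in forecasting tournaments. The `has_marker` field, the `mean_brier_by_marker` helper, and the four toy records are hypothetical illustrations, not data or methods from the paper.

```python
from statistics import mean


def brier_score(probability: float, outcome: int) -> float:
    """Squared error between a forecast probability and a binary outcome (0 or 1)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    return (probability - outcome) ** 2


def mean_brier_by_marker(records):
    """Average Brier score for rationales with and without a marker flag.

    `has_marker` is a hypothetical stand-in for an LLM-assigned reasoning
    pattern; the paper's actual markers and scoring are not reproduced here.
    """
    groups = {True: [], False: []}
    for record in records:
        score = brier_score(record["probability"], record["outcome"])
        groups[bool(record["has_marker"])].append(score)
    # Skip empty groups so mean() is never called on an empty list.
    return {flag: mean(scores) for flag, scores in groups.items() if scores}


# Toy records invented for illustration only (not from the tournament data).
records = [
    {"probability": 0.9, "outcome": 1, "has_marker": True},
    {"probability": 0.2, "outcome": 0, "has_marker": True},
    {"probability": 0.6, "outcome": 0, "has_marker": False},
    {"probability": 0.5, "outcome": 1, "has_marker": False},
]

summary = mean_brier_by_marker(records)
print(summary)
```

A lower group average would indicate more accurate forecasts in that group. A comparison like this is descriptive only and says nothing about whether the marker causes better accuracy.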

Why this matters

Why now

As AI-generated content and expert systems proliferate, decision-makers need robust ways to evaluate the quality and reliability of the explanations, or rationales, behind decisions.

Why it’s important

This research offers a scalable, LLM-driven approach to measuring the quality of natural-language explanations, which matters both for trustworthy AI adoption and for evaluating expert judgment.

What changes

The ability to systematically score explanation quality using AI models could significantly enhance the development, auditing, and deployment of agentic systems and human-AI collaboration.

Winners

· AI developers focused on explainability
· Organizations relying on expert forecasting
· Auditors of AI systems
· LLM providers

Losers

· Opaque black-box AI systems
· Experts providing low-quality rationales
· Traditional, manual explanation review processes

Second-order effects

Direct

Refinement of AI agent reasoning capabilities through feedback loops on explanation quality.

Second

Increased demand for explainable AI outputs across various industries, leading to new compliance standards.

Third

Potential for 'explanation marketplaces' where valuable rationales are traded or licensed, fostering a new knowledge economy.

Editorial confidence: 90 / 100 · Structural impact: 60 / 100

Original report

This signal links to a primary source. Continuum Brief monitors and indexes it as part of the live intelligence stream; we do not republish source content.

Read at arXiv cs.CL

#cs.CL #econ.GN #q-fin.EC
