
arXiv:2604.26269v2. Abstract: In the era of large language models, creative writing quality lacks a computable theoretical anchor. The dominant approaches are rubric scoring -- decomposing holistic aesthetic judgment into sub-scores -- and RLHF preference signals -- replacing quality with group votes. Both bypass the statistical structure of the text itself. This paper provides an information-theoretic foundation to fill this gap. We propose 'calibrated surprise' as the information-theoretic essence of excellent creative writing. This judgment matches reading intuition an
The proliferation of advanced large language models (LLMs) has created an urgent need for robust methods to assess creative quality beyond subjective human judgment or simplistic metrics.
Developing a computable, information-theoretic framework for creative quality can significantly advance AI's ability to generate and evaluate complex outputs, moving beyond mere statistical mimicry.
The proposed 'calibrated surprise' metric offers a foundational, objective measure for creative writing quality, potentially standardizing evaluation in a field previously dominated by subjective rubrics and preference signals.
- AI developers
- Creative AI platforms
- Computational linguistics researchers
- Traditional qualitative assessment methods
- Rubric-based evaluation
- RLHF as sole quality signal
AI systems will be able to more accurately self-evaluate and refine their creative outputs.
This could lead to a new generation of creative AI tools that produce demonstrably higher-quality content.
The concept of 'calibrated surprise' might extend beyond text to other creative AI domains like art, music, and design, profoundly altering human-AI co-creation.
Read at arXiv cs.CL