
arXiv:2603.11482v2. Abstract: Evaluating 'anime-like' voices currently relies on costly subjective judgments, yet no standardized objective metric exists. A key challenge is that anime-likeness, unlike naturalness, lacks a shared absolute scale, making conventional Mean Opinion Score (MOS) protocols unreliable. To address this gap, we propose AnimeScore, a preference-based framework for automatic anime-likeness evaluation via pairwise ranking. We collect 15,000 pairwise judgments from 187 evaluators with free-form descriptions, and acoustic analysis reveals that per
The proliferation of generative AI models capable of speech synthesis increases the need for evaluation frameworks that go beyond traditional metrics such as MOS.
Developing preference-based evaluation for specific stylistic elements like 'anime-likeness' allows for more precise control and improvement in generative AI audio synthesis.
The ability to objectively measure and optimize 'anime-likeness' will refine the development of particular aesthetic styles in synthetic speech, moving beyond subjective human judgment.
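To make the idea of scoring from pairwise preferences concrete, here is a minimal sketch of how a set of "clip A sounds more anime-like than clip B" judgments can be turned into one score per clip. It uses the standard Bradley-Terry model, which is an assumption chosen for illustration: the abstract does not say which ranking model AnimeScore uses. The clip names and judgment counts are hypothetical.

```python
from collections import defaultdict


def fit_bradley_terry(judgments, n_iter=200, tol=1e-9):
    """Estimate one strength score per item from (winner, loser) pairs.

    Uses the classic minorization-maximization update for the
    Bradley-Terry model. Assumes winner != loser in every pair and that
    every item wins and loses at least once; otherwise its score drifts
    toward 0 or infinity.
    """
    wins = defaultdict(int)         # total wins per item
    pair_counts = defaultdict(int)  # number of comparisons per unordered pair
    items = set()
    for winner, loser in judgments:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        items.update((winner, loser))

    scores = {item: 1.0 for item in items}
    for _ in range(n_iter):
        new_scores = {}
        for i in items:
            denom = 0.0
            for pair, n in pair_counts.items():
                if i in pair:
                    (j,) = pair - {i}  # the other item in this pair
                    denom += n / (scores[i] + scores[j])
            new_scores[i] = wins[i] / denom
        # Normalize so the scores sum to 1 (the model is scale-invariant).
        total = sum(new_scores.values())
        new_scores = {k: v / total for k, v in new_scores.items()}
        delta = max(abs(new_scores[k] - scores[k]) for k in items)
        scores = new_scores
        if delta < tol:
            break
    return scores


# Hypothetical judgments: each tuple is (preferred clip, other clip).
judgments = (
    [("clip_a", "clip_b")] * 8 + [("clip_b", "clip_a")] * 2
    + [("clip_b", "clip_c")] * 7 + [("clip_c", "clip_b")] * 3
    + [("clip_a", "clip_c")] * 9 + [("clip_c", "clip_a")] * 1
)
scores = fit_bradley_terry(judgments)
for clip in sorted(scores, key=scores.get, reverse=True):
    print(clip, round(scores[clip], 3))
```

Under this model, the probability that one clip is preferred over another is its score divided by the sum of the two scores, so no absolute rating scale is needed. This is the property that makes pairwise protocols attractive when raters do not share a common scale.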
- AI audio synthesis developers
- Entertainment industry
- Gaming sector
- Generative AI platforms
- Traditional voice acting for specific niche styles
Improved synthetic voice actors with highly specific stylistic attributes tailored for different media.
Expansion of AI character voice generation with finer control over stylistic nuance, reducing production costs and timelines.
The emergence of entirely new forms of media where AI-generated emotive and stylistic voices become the norm, possibly influencing human voice acting trends.
Read at arXiv cs.CL