GhazalBench: Evaluating LLM Understanding and Canonical Surface-Form Access in Persian Ghazals

arXiv:2603.09979v2. Abstract: Persian poetry plays an active role in Iranian cultural practice, where verses by canonical poets such as Hafez are frequently quoted, paraphrased, or completed from partial cues. Supporting such interactions requires language models to engage not only with poetic meaning but also with culturally canonical surface form. We introduce GhazalBench, a benchmark for evaluating how large language models (LLMs) interact with Persian ghazals under usage-grounded conditions. Unlike prior work that primarily studies memorization as a liability, GhazalB
The proliferation of LLMs and growing interest in applying them beyond dominant languages are driving the need for culturally specific evaluation benchmarks.
This benchmark highlights the crucial role of cultural context and surface-form access for LLMs, especially for non-Western languages, and signals a move towards AI systems that respect and understand localized cultural practices.
LLM development will shift toward more nuanced cultural and linguistic understanding, moving beyond purely semantic evaluation to incorporate culturally canonical forms and usage-grounded conditions.
- AI researchers in non-English languages
- Persian language AI developers
- Cultural preservation initiatives via AI
- LLMs without strong multilingual or cultural fine-tuning
- General-purpose, 'one-size-fits-all' AI development approaches
The introduction of GhazalBench provides a specific tool for evaluating LLMs on Persian ghazals, targeting both meaning and culturally canonical surface form.
This specific benchmark will likely spur similar initiatives for other culturally rich and non-dominant languages, driving diversification in AI evaluation and development.
Increased focus on culturally specific AI applications could lead to the development of sovereign AI solutions tailored to individual nations' linguistic and cultural heritage.
Read at arXiv cs.CL