Constrained Semantic Decompression in LLMs through Persian Proverb-Conditioned Story Generation

arXiv:2606.12599v1 (announce type: new)

Abstract: Transforming a dense, abstract proverb into an engaging and morally faithful narrative requires deep cultural understanding and robust semantic grounding. We frame this problem as a 'constrained semantic decompression' task and study proverb-conditioned story generation as a testbed for abstraction-to-realization in large language models (LLMs). Focusing on Persian, we introduce the Proverb Aligned Narrative Dataset (PAND), pairing proverbs with human-written stories and explicit meanings. By a hybrid evaluation framework that combines human
The rapid advancements in large language models necessitate increasingly sophisticated methods for grounding AI in cultural context and complex reasoning, making constrained semantic decompression a timely area of focus.
This research addresses a critical challenge in AI development: enabling models to generate culturally nuanced and contextually accurate narratives from abstract concepts, which is vital for broader AI adoption and trustworthiness.
The explicit framing of 'constrained semantic decompression' as a task and the introduction of a new dataset (PAND) provide a dedicated framework for evaluating and improving LLM capabilities in cultural understanding and narrative generation.
- AI researchers
- Cultural preservation initiatives
- Creative content industries
- Educational technology
- LLMs lacking cultural grounding
- Generic content generation platforms
Improved performance of LLMs in culturally-specific content creation and understanding.
Increased demand for culturally diverse and nuanced datasets for AI training and evaluation.
Enhanced ability for AI to act as a bridge for cross-cultural communication and understanding, potentially fostering greater global empathy.
Read at arXiv cs.CL