
arXiv:2510.10774v3 Announce Type: replace-cross Abstract: Persian remains substantially underrepresented in open speech-text resources, limiting progress in multi-speaker text-to-speech (TTS), speech-language modelling, and low-resource speech processing. We introduce ParsVoice, the largest publicly available Persian speech-text corpus tailored for training multi-speaker TTS systems, along with a scalable pipeline to construct high-quality speech-text data from long-form audiobook recordings. The pipeline combines a fine-tuned ParsBERT sentence-completion classifier, ASR-based boundary optimiz
The release of ParsVoice addresses a critical gap in open-source AI resources for less-resourced languages, coinciding with a global push for more inclusive and diverse AI development.
This matters for nations and regions that want to build their own AI capabilities and reduce their dependence on models trained exclusively on dominant languages, a step toward digital sovereignty.
A large-scale Persian speech corpus should enable the development of multi-speaker text-to-speech (TTS) systems and other speech technologies for Persian, a language whose speech tooling has lagged behind that of major languages.
- Iranian tech companies
- Persian-speaking populations
- AI researchers in low-resource languages
- NLP/TTS developers
Improved AI applications and services for Persian speakers, including voice assistants and accessibility tools.
Increased regional digital autonomy and reduced reliance on foreign AI infrastructure for Persian language processing.
Potential for other nations with underrepresented languages to accelerate similar domestic AI data and model development efforts.
Read at arXiv cs.LG