
arXiv:2606.07523v1 Announce Type: cross Abstract: Legal domains in high-resource languages like English have widely adopted artificial intelligence for legal question answering. However, data scarcity in low-resource languages such as Nepali has limited the training of large language models on Nepali legal texts. This study presents the first application of a Retrieval-Augmented Generation based model for Nepali legal question answering using case laws extracted from the Nepal Kanun Patrika digital archive. Using BM25 on chunked documents, the approach achieved a top precision at one of 91 per
The increasing maturity of large language models and the growing global interest in AI applications naturally extend to addressing data scarcity in low-resource languages.
This development indicates progress in making powerful AI tools accessible to a broader range of linguistic communities, especially in critical sectors like law, and it reinforces the trend toward localized AI solutions.
Applying Retrieval-Augmented Generation (RAG) to legal domains in low-resource languages like Nepali means that previously inaccessible data can now be processed and queried efficiently.
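The retrieval step named in the abstract (BM25 over chunked documents, scored by precision at one) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `chunk` helper, the `BM25` class, the parameter values `k1=1.5` and `b=0.75`, and the English toy text, which stands in for real case law. It is not the paper's pipeline, data, or tokenizer, and it covers only retrieval, not generation.

```python
import math
from collections import Counter


def chunk(text, size=8, overlap=2):
    """Split text into word windows of `size` words sharing `overlap` words."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]


class BM25:
    """Plain Okapi BM25 over whitespace-tokenized chunks (illustrative only)."""

    def __init__(self, chunks, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.docs = [c.lower().split() for c in chunks]
        self.tfs = [Counter(d) for d in self.docs]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        n = len(self.docs)
        df = Counter(term for d in self.docs for term in set(d))
        # The "+1" inside the log keeps every IDF value positive.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5))
                    for t, f in df.items()}

    def score(self, query, i):
        tf, dl = self.tfs[i], len(self.docs[i])
        total = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            norm = tf[term] + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[term] * tf[term] * (self.k1 + 1) / norm
        return total

    def top1(self, query):
        """Index of the highest-scoring chunk for the query."""
        return max(range(len(self.docs)), key=lambda i: self.score(query, i))


def precision_at_1(index, labelled_queries):
    """Fraction of queries whose top-ranked chunk is the labelled one."""
    hits = sum(1 for q, gold in labelled_queries if index.top1(q) == gold)
    return hits / len(labelled_queries)


# Hypothetical toy text; overlap=0 keeps chunk boundaries on sentence breaks.
text = ("the tenant must vacate the property after notice "
        "the court awarded damages for breach of contract "
        "inheritance shall be divided equally among all heirs")
chunks = chunk(text, size=8, overlap=0)
index = BM25(chunks)
labelled = [("tenant notice property", 0),
            ("breach of contract damages", 1),
            ("inheritance among heirs", 2)]
print(precision_at_1(index, labelled))  # 1.0 on this toy set
```

The toy score says nothing about the figure reported in the abstract; it only shows how the metric is computed. In a full RAG setup, the top-ranked chunk would be passed to a language model as context for answering the question.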
- Nepal (legal sector)
- Low-resource language communities
- AI researchers in natural language processing
- Legal tech companies
Legal professionals in Nepal gain access to more efficient legal research tools and data analysis capabilities.
Similar RAG frameworks are likely to be developed and deployed for other low-resource languages and specialized domains.
This could lead to a digital transformation of legal systems in developing nations, potentially influencing legal education and policy-making.
Read at arXiv cs.AI