Train, Retrieve, or Both? A Four-Arm Head-to-Head for Correct Statutory Citation on the Ontario Residential Tenancies Act

arXiv:2606.20359v1 Abstract: Self-represented tenants, landlords, and help-desk staff need to be pointed at the provision of law that actually governs a question, with a correct statutory citation. We study this task on the Ontario Residential Tenancies Act, 2006 (RTA) and its core regulation, asking the operator's question empirically: is fine-tuning enough, or is hybrid retrieval needed? We run a four-arm head-to-head on Qwen2.5-7B-Instruct (base zero-shot, LoRA SFT-only, RAG-only, and an SFT+RAG hybrid), scored on citation exact-match (section+subsection) over a small, hu
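The abstract names citation exact-match at the section+subsection level as the scoring metric. A minimal sketch of how such a metric could be computed per arm follows. The citation formats accepted by the parser ("s. 83(1)", "section 83 (1)", "83(1)") and the function names `parse_citation`, `exact_match`, and `score_arms` are assumptions made for illustration; the paper's actual normalization rules are not given in this excerpt.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumed citation shapes: "s. 83(1)", "section 83 (1)", "83(1)", "s. 48.1".
# This pattern is illustrative only, not the paper's normalization.
_CITE = re.compile(
    r"(?:s\.|section)?\s*(\d+(?:\.\d+)?)\s*(?:\(\s*(\d+(?:\.\d+)?)\s*\))?",
    re.IGNORECASE,
)


def parse_citation(text: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (section, subsection) or None if the text is not a citation."""
    match = _CITE.fullmatch(text.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


def exact_match(predicted: str, gold: str) -> bool:
    """True only when both section and subsection agree after parsing."""
    gold_parsed = parse_citation(gold)
    return gold_parsed is not None and parse_citation(predicted) == gold_parsed


def score_arms(predictions: Dict[str, List[str]], gold: List[str]) -> Dict[str, float]:
    """Exact-match rate for each arm over the same gold citations."""
    if not gold:
        raise ValueError("gold set is empty")
    scores = {}
    for arm, preds in predictions.items():
        if len(preds) != len(gold):
            raise ValueError(f"arm {arm!r} has {len(preds)} predictions, expected {len(gold)}")
        hits = sum(exact_match(p, g) for p, g in zip(preds, gold))
        scores[arm] = hits / len(gold)
    return scores
```

Under this sketch, a prediction that names the right section but omits or misstates the subsection counts as a miss, which is what a strict section+subsection criterion implies.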
The proliferation of complex legal information online and the increasing accessibility of large language models are creating a demand for precise legal guidance for non-experts.
More accurate AI-generated legal citations improve access to justice and reduce friction in administrative processes, potentially lowering costs and increasing efficiency.
This research provides empirical evidence on which strategy (fine-tuning, RAG, or a hybrid of the two) yields the most accurate statutory citations, guiding future development in specialized legal AI.
- Legal tech companies
- Self-represented individuals
- Help-desk staff (legal/government)
- Jurisdictions with clear, digitized statutes
- Inefficient manual legal research processes
- AI models without specialized fine-tuning or RAG capabilities
Increased operational efficiency for administrative legal tasks and reduced error rates in legal information provision.
Greater trust in AI-powered legal assistance tools for initial guidance, potentially democratizing access to basic legal information.
The development of highly specialized, domain-specific AI models that combine fine-tuning and retrieval for better performance in complex regulatory domains other than law.
Read at arXiv cs.LG