LEDGER: A Long-Context Benchmark of Corporate Annual Reports for Grounded Financial Retrieval and Extraction

arXiv:2606.13100v1. Abstract: Finance reporting is a natural proving ground for large language models, and the very-long-context capabilities of recent models across all sizes make rigorous evaluation in this domain an increasingly pressing need. Yet most public financial resources reduce the task to plain-text SEC 10-K filings paired with a handful of question-answer items. We release LEDGER (Long-context Evaluation of Documents for Grounded Extraction and Retrieval), a corpus of 4,999 digitized corporate annual reports - full documents with figures, tables, and narrative, n
The rapid advancement of large language models, particularly their extended context windows, creates an immediate need for financial domain-specific benchmarks to validate their utility.
This new benchmark provides a rigorous, long-context evaluation crucial for developing and deploying AI agents capable of accurate financial retrieval and extraction from complex annual reports.
The availability of LEDGER shifts the focus from simple text analysis of SEC filings to comprehensive AI understanding of full corporate reports, including figures and tables, providing a more robust testing ground.
- AI developers focused on finance
- Financial analysts using AI
- Large language model providers
- AI models with short context windows
- Traditional financial data extraction services
Improved performance and reliability of AI systems for financial analysis and reporting.
Increased automation of white-collar financial tasks, potentially reducing the need for manual data extraction.
New financial products and services enabled by deeper, more accurate AI-driven insights from corporate disclosures.
Read at arXiv cs.CL