ScholarQuest: A Taxonomy-Guided Benchmark for Agentic Academic Paper Search in Open Literature Environments

arXiv:2606.20235v1 Announce Type: cross Abstract: Academic paper search is a core step in scientific research, and LLM-based search agents are emerging as a promising paradigm for iterative, intent-driven literature exploration. However, existing benchmarks are insufficient for systematically evaluating agentic academic search under realistic open literature environments. We propose ScholarQuest, a large-scale, taxonomy-guided benchmark for agentic academic paper search. ScholarQuest is constructed from over 1,000 computer science topics and four representative research intents, including meth
The development of LLM-based search agents necessitates robust evaluation metrics as these technologies mature and become more integrated into scientific research workflows.
This benchmark provides a critical tool for measuring and improving the efficacy of AI agents in academic research, which will accelerate scientific discovery and knowledge synthesis.
The ability to systematically evaluate and compare agentic academic search systems under realistic conditions will lead to more effective and trustworthy AI tools for scientists.
- AI agent developers
- Academic researchers
- Scientific publishers
- LLM providers
- Manual literature review services
AI agents for academic paper search will improve in performance and see broader adoption.
More efficient knowledge access will accelerate scientific discovery and the formation of interdisciplinary connections.
The role of human researchers may shift from extensive search to higher-level analysis and synthesis, supported by advanced AI tools.
Source: arXiv cs.AI