
arXiv:2605.26902v1 Announce Type: cross Abstract: Generative retrieval (GR) maps queries directly to document identifiers (docids) using parametric knowledge. However, this design makes corpus expansion costly: adding new documents requires updating model parameters to encode new document-docid associations, which incurs repeated training and catastrophic forgetting of previously indexed documents. In this work, we revisit incremental GR as an in-context retrieval problem, where newly added documents are supplied as inference-time document-docid evidence. We propose ICICLE, an in-context indexing fra
The rapid expansion of AI models and the increasing need for real-time, flexible information retrieval are pushing research into more efficient and adaptable methods for corpus expansion.
This development addresses a key limitation in generative retrieval, enabling more cost-effective and dynamic handling of new information, which is critical for continuously updated AI knowledge bases.
The paradigm shifts from costly retraining for corpus expansion to in-context retrieval, making AI systems more adaptable and scalable for integrating new data.
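To make the idea of inference-time document-docid evidence concrete, here is a minimal sketch of how newly added documents might be placed in a prompt ahead of a query. It is an illustration only, not ICICLE's method: the prompt layout, the `IndexedDoc` type, the docid format, and both helper functions are hypothetical, and the model call itself is left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class IndexedDoc:
    docid: str  # hypothetical identifier format, e.g. "d-102"
    text: str


def build_incontext_prompt(query: str, new_docs: List[IndexedDoc]) -> str:
    """Format newly added documents as document-docid evidence ahead of the query.

    Assumed layout: one evidence line per document, then the query, then a
    cue for the model to generate a docid.
    """
    lines = ["Newly indexed documents:"]
    for doc in new_docs:
        lines.append(f"[docid: {doc.docid}] {doc.text}")
    lines.append(f"Query: {query}")
    lines.append("Relevant docid:")
    return "\n".join(lines)


def accept_docid(
    generated: str, new_docs: List[IndexedDoc], base_docids: Set[str]
) -> Optional[str]:
    """Keep a generated docid only if it names a known document.

    Known documents are those already indexed in the model's parameters
    (base_docids) plus those supplied in context (new_docs).
    """
    candidate = generated.strip()
    known = base_docids | {doc.docid for doc in new_docs}
    return candidate if candidate in known else None
```

In this sketch, adding a document means appending one more evidence line to the prompt rather than updating model parameters, which is the trade the abstract describes. The validation step is an added assumption: it simply discards generated identifiers that match no known document.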
- AI developers
- Generative AI platforms
- Data-intensive industries
- Search engine companies
- Companies reliant on static, infrequently updated AI models
AI models will become more adept at incorporating new information without incurring significant retraining costs.
This improved adaptability could accelerate the development and deployment of more current and relevant AI applications across various sectors.
The reduced barrier to corpus expansion might lead to a proliferation of specialized, rapidly evolving AI knowledge bases, creating new competitive landscapes.
Read at arXiv cs.AI