CoRe: A Continuously Reward-Finetuned LLM Query Rewriter for Multi-Stage Context-Aware Relevance in Web-Scale Video Search

arXiv:2606.14127v1. Abstract: LLM-based query rewriters in production face a tension: the training reward must reflect how the rewrite is consumed by the production ranker, yet the training procedure must be cheap enough to support continuous redeployment as data drifts. We present CoRe (Context Relevance), such a system, redeployed weekly for over five months in a major short-video search engine. Our reward uses the deployed multimodal relevance model as its source and a multiplicative ratio form mirroring the production fusion algebra, closing the simulation-production ga
Weekly redeployment of an LLM query rewriter for more than five months in a production short-video search engine suggests that LLM components are maturing into routinely maintained search infrastructure.
The work targets the practical tension named in the abstract: the training reward should reflect how the production ranker consumes a rewrite, while training stays cheap enough to repeat as data drifts.
Sourcing the reward from the deployed relevance model is intended to keep rewrites aligned with production behavior, which matters for any large online system whose data keeps changing.
- AI-powered search engines
- Content platforms
- MLOps providers
- Cloud computing providers
- Traditional algorithmic search methods
- Companies slow to adopt LLM-based systems
Improved user experience and engagement on platforms utilizing continuously fine-tuned LLM rewriters.
Increased competitive pressure on search and discovery platforms to integrate similar advanced AI capabilities.
The development of more sophisticated and self-optimizing AI agents operating within large-scale distributed systems.
Read at arXiv cs.CL