DSIRM: Learning Query-Bridged Discrete Semantic Identifiers for E-commerce Relevance Modeling

arXiv:2606.04374v1 (announce type: cross)
Abstract: Despite rapid progress of continuous embeddings for e-commerce search relevance, a long-standing open problem is the difficulty in capturing fine-grained attribute distinctions. While discrete Semantic Identifiers (SIDs) have been widely adopted as a promising alternative, existing SID generation methods rely heavily on unsupervised quantization. In realistic scenarios, the lack of explicit supervision often makes it more difficult to dictate which items should share an SID, resulting in limited capability for query-dependent ranking. To addres
As e-commerce grows and user queries become more complex, relevance models need to move beyond traditional continuous embeddings.
Search relevance directly affects sales, user experience, and the overall efficiency of online marketplaces, which makes it an important source of competitive advantage.
The paper proposes learning query-bridged discrete semantic identifiers (SIDs), with the aim of drawing more granular, context-aware distinctions between products than prior unsupervised quantization methods do.
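For context on the unsupervised baseline that the abstract contrasts against, here is a minimal sketch of one common way to derive SIDs from item embeddings: residual quantization with k-means codebooks. This is not the paper's method (the truncated abstract does not describe it). The function names, codebook size, number of levels, and random embeddings are all illustrative assumptions.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct data points.
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every centroid.
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        assign = dist.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return centroids, dist.argmin(axis=1)


def residual_quantize(embeddings, codebook_size=4, levels=2, seed=0):
    """Assign each item a tuple of codes, one per level (hypothetical helper).

    Each level clusters what the previous levels failed to explain,
    so later codes refine earlier ones.
    """
    residual = np.asarray(embeddings, dtype=np.float64).copy()
    codes = []
    for level in range(levels):
        centroids, assign = kmeans(residual, codebook_size, seed=seed + level)
        codes.append(assign)
        residual = residual - centroids[assign]
    return np.stack(codes, axis=1)  # shape: (num_items, levels)


# Toy data standing in for item embeddings (assumption: 32 items, 8 dims).
rng = np.random.default_rng(0)
item_embeddings = rng.normal(size=(32, 8))
sids = residual_quantize(item_embeddings)
print(sids[:3])  # one (level-1 code, level-2 code) pair per item
```

Because the codebooks here are fit only on item embeddings, nothing in this procedure controls which items end up sharing a code for a given query, which is the limitation the abstract highlights.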
- E-commerce platforms
- AI/ML researchers in information retrieval
- Companies with large product catalogs
- Platforms relying solely on naive continuous embeddings
- Users experiencing irrelevant search results
Search accuracy and user satisfaction on e-commerce platforms could improve.
E-commerce businesses could see higher sales conversion rates and lower product return rates.
Smaller vendors with unique products could become easier to discover within large marketplaces.