
arXiv:2508.07952v2. Abstract: Clustering algorithms often assume all features contribute equally to the data structure, an assumption that usually fails in high-dimensional or noisy settings. Feature weighting methods can address this, but most require additional parameter tuning. We propose SHARK (Shapley Reweighted $k$-means), a feature-weighted clustering algorithm motivated by the use of Shapley values from cooperative game theory to quantify feature relevance, which requires no additional parameters beyond those in $k$-means. We prove that the $k$-means objective can
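To make the idea of feature weighting concrete, here is a minimal sketch of generic feature-weighted $k$-means in Python with NumPy. It is not the SHARK algorithm: the excerpt above does not show how SHARK derives its weights from Shapley values, so the `weights` argument is a hand-picked placeholder, and the function names `per_feature_wcss` and `weighted_kmeans` are hypothetical. The sketch relies only on one well-known fact: squared Euclidean distance is a sum over coordinates, so the within-cluster sum of squares splits into one term per feature.

```python
import numpy as np

def per_feature_wcss(X, labels, k):
    """Within-cluster sum of squares, split by feature (shape: n_features)."""
    out = np.zeros(X.shape[1])
    for c in range(k):
        pts = X[labels == c]
        if len(pts):
            out += ((pts - pts.mean(axis=0)) ** 2).sum(axis=0)
    return out

def weighted_kmeans(X, k, weights, n_iter=50, seed=0):
    """Lloyd's algorithm on features rescaled by sqrt(weights).

    Scaling feature j by sqrt(w_j) multiplies its squared-distance
    term by w_j, which is the usual way to apply fixed feature weights.
    """
    rng = np.random.default_rng(seed)
    Z = X * np.sqrt(weights)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

# Toy data: feature 0 separates two groups, feature 1 is pure noise.
rng = np.random.default_rng(0)
X = np.column_stack([
    np.concatenate([rng.normal(-5, 0.5, 50), rng.normal(5, 0.5, 50)]),
    rng.normal(0, 3, 100),
])

# Placeholder weights (an assumption): down-weight the noisy feature.
labels = weighted_kmeans(X, k=2, weights=np.array([1.0, 0.1]))
contrib = per_feature_wcss(X, labels, k=2)
print("per-feature share of the objective:", contrib / contrib.sum())
```

In this sketch the weights are fixed by hand. A parameter-free method in the sense the abstract describes would instead have to derive them from the data.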
High-dimensional and noisy data are increasingly common in machine learning, which creates demand for clustering methods that are robust to irrelevant features and need no extra tuning.
Clustering algorithms that are more robust and have fewer hyperparameters lower the technical barrier to using them and reduce the risk of mis-tuned models, which could make applications more reliable and accessible.
If the approach holds up in practice, clustering models could be deployed with greater reliability and less hyperparameter tuning in data-rich environments, potentially accelerating AI development and adoption in complex domains.
- Machine Learning Researchers
- Data Scientists
- AI-driven industries
- Cloud Computing Providers
- Developers of less robust clustering algorithms
- Specialists in complex hyperparameter tuning
Wider adoption and easier application of clustering techniques in high-dimensional data problems.
Potentially less tuning overhead and expertise required for effective data analysis and model training.
Acceleration of discovery in fields reliant on unsupervised learning, potentially leading to new scientific or commercial breakthroughs.
Read at arXiv cs.LG