Beyond Rebalancing: Benchmarking Binary Classifiers Under Class Imbalance Without Rebalancing Techniques

arXiv:2509.07605v2. Abstract: Class imbalance poses a significant challenge to supervised classification, particularly in critical domains like medical diagnostics and anomaly detection where minority class instances are rare. While numerous studies have explored rebalancing techniques to address this issue, less attention has been given to evaluating the performance of binary classifiers under imbalance when no such techniques are applied. Therefore, the goal of this study is to assess the performance of binary classifiers "as-is", without performing any explicit r
This research is timely as AI applications proliferate into critical domains where class imbalance is common, necessitating robust and transparent evaluation methods.
Understanding classifier performance without rebalancing is crucial for deploying reliable AI systems in sensitive areas like medical diagnostics and anomaly detection, where 'as-is' performance directly impacts real-world outcomes.
This study encourages a more rigorous 'baseline' understanding of classification models, potentially leading to increased scrutiny of reported performance metrics that rely heavily on often-unmentioned rebalancing techniques.
- AI ethicists
- Healthcare AI providers
- Cybersecurity AI developers
- AI models with subtle class imbalance vulnerabilities
- Developers relying solely on rebalancing for performance metrics
Improved understanding of binary classifier limitations under class imbalance when no rebalancing techniques are used.
Increased demand for inherently robust classification algorithms that perform well under varied data distributions without requiring explicit data manipulation.
Potential shifts in regulatory scrutiny towards AI models deployed in critical applications, emphasizing baseline performance and transparency regarding data preprocessing.
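To make "baseline performance" under imbalance concrete, here is a minimal sketch of why a single headline metric can mislead when no rebalancing is applied. It is an illustration only: the data (990 negatives, 10 positives), the always-negative "classifier", and the function names are hypothetical, and the choice of metrics (accuracy, recall, specificity, balanced accuracy, Matthews correlation coefficient) is an assumption of this sketch, not necessarily the set used in the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels where 1 is the minority class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def imbalance_aware_report(y_true, y_pred):
    """Return accuracy alongside metrics that account for both classes.

    Undefined ratios (zero denominators) are reported as 0.0 by convention.
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "recall": recall,
        "specificity": specificity,
        "balanced_accuracy": (recall + specificity) / 2,
        "mcc": mcc,
    }


# Hypothetical data: 1% minority class, no rebalancing applied.
y_true = [0] * 990 + [1] * 10
# Hypothetical degenerate model that always predicts the majority class.
y_pred = [0] * 1000

report = imbalance_aware_report(y_true, y_pred)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

In this toy case accuracy is 0.99 even though no minority instance is detected. Recall is 0, balanced accuracy is 0.5, and MCC is 0, which is the kind of gap an "as-is" baseline report would be expected to surface.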
Read at arXiv cs.AI