Which Leakage Types Matter? A Quantitative Landscape Across 2,047 Benchmark Datasets

arXiv:2604.04199v2

Abstract: Twenty-eight within-subject counterfactual experiments across 2,047 iid tabular datasets, plus a boundary experiment on 129 temporal datasets, measure the severity of four data leakage classes in machine learning. Class I (estimation: fitting scalers on full data) is negligible: all nine conditions produce $|\Delta\mathrm{AUC}| \leq 0.005$. Class II (selection: peeking, seed cherry-picking) is substantial: the measured effect is consistent with about 90% noise exploitation inflating reported scores. Class III (memorization) scales with model capacit
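For readers unfamiliar with Class I (estimation) leakage, the following is a minimal illustrative sketch, not code from the paper. It assumes a single synthetic Gaussian feature and a hypothetical 800/200 train/test split, and compares standardizing the test split with statistics estimated on all rows (leaky) against statistics estimated on the training rows only (clean). The helper names `fit_scaler` and `transform` are made up for this example.

```python
import random
import statistics


def fit_scaler(values):
    """Return (mean, std) estimated from the given values only."""
    return statistics.fmean(values), statistics.pstdev(values)


def transform(values, mean, std):
    """Standardize values with previously estimated statistics."""
    return [(v - mean) / std for v in values]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Synthetic feature (an assumption for illustration, not benchmark data).
feature = [rng.gauss(10.0, 2.0) for _ in range(1000)]
train, test = feature[:800], feature[800:]

# Leaky variant: statistics are estimated on train + test together.
leaky_mean, leaky_std = fit_scaler(feature)
# Clean variant: statistics are estimated on the training split only.
clean_mean, clean_std = fit_scaler(train)

leaky_test = transform(test, leaky_mean, leaky_std)
clean_test = transform(test, clean_mean, clean_std)

# Largest per-sample difference between the two scaled test sets.
max_shift = max(abs(a - b) for a, b in zip(leaky_test, clean_test))
print(f"max shift in scaled test feature: {max_shift:.4f}")
```

The sketch only shows how the two preprocessing variants differ in construction; it does not measure a model's AUC and should not be read as reproducing the abstract's reported figures.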

Source: arXiv cs.LG — read the full report at the original publisher.

This is a curated wire item. The Continuum Brief does not republish full third-party articles; this entry links to the original source.
