arXiv:2605.23017v1 Announce Type: new Abstract: One prominent method of evaluating machine learning model trustworthiness is the notion of calibration. In the binary outcome setting, a probabilistic predictor is calibrated if outcomes are realized according to a model's distributional prediction, conditioned on this prediction. Straightforward extensions of binary calibration definitions to probabilistic multiclass classifiers suffer from an exponential complexity blowup as the space of predictions grows exponentially in the number of classes $n$. As a remedy, Noarov and Roth (2023) propose mu
Source: arXiv cs.LG — read the full report at the original publisher.
