Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control
Best AI papers explained - A podcast by Enoch H. Kang

This paper presents the Learn then Test (LTT) framework, a novel approach for calibrating machine learning models to provide explicit statistical guarantees on their predictions. The method works with any underlying model and data distribution and requires no retraining. LTT reframes the problem of controlling statistical risks, such as the false discovery rate, intersection-over-union-based losses, and type-1 error, as a multiple hypothesis testing problem. By computing a p-value for each candidate prediction setting (indexed by a parameter λ) and applying a family-wise error rate (FWER)-controlling procedure such as the Bonferroni correction or sequential graphical testing, the framework identifies the settings for which the desired risk level is statistically guaranteed. The authors demonstrate the framework's utility across a range of machine learning tasks, including multi-label classification, selective classification, selective regression, outlier detection, and instance segmentation, providing novel, distribution-free guarantees.
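As a rough illustration of the calibration recipe described above (not code from the episode or the paper's official repository), here is a minimal sketch in Python. It assumes losses bounded in [0, 1], Hoeffding-based p-values, and a plain Bonferroni correction over a grid of λ values; the names `losses`, `lambdas`, `alpha`, and `delta` are illustrative choices, not the paper's notation verbatim.

```python
import numpy as np

def hoeffding_pvalue(risk_hat, n, alpha):
    """Valid p-value for H_j: risk(lambda_j) > alpha, assuming losses in [0, 1].

    By Hoeffding's inequality, P(risk_hat <= alpha) <= exp(-2 n (alpha - risk)^2)
    under the null, so this quantity is super-uniform and usable as a p-value.
    """
    return np.exp(-2.0 * n * np.clip(alpha - risk_hat, 0.0, None) ** 2)

def learn_then_test(losses, lambdas, alpha=0.1, delta=0.1):
    """Sketch of LTT calibration with Bonferroni FWER control.

    losses : (n, m) array, losses[i, j] = loss of calibration example i
             when the model is run with setting lambdas[j].
    Returns the subset of lambdas certified to have risk <= alpha
    with probability at least 1 - delta.
    """
    n, m = losses.shape
    risk_hat = losses.mean(axis=0)                 # empirical risk at each lambda
    pvals = hoeffding_pvalue(risk_hat, n, alpha)   # one p-value per hypothesis H_j
    keep = pvals <= delta / m                      # Bonferroni: controls FWER at delta
    return lambdas[keep]

# Toy usage: binary losses from a hypothetical selective-classification setup.
rng = np.random.default_rng(0)
lambdas = np.linspace(0.0, 1.0, 50)
losses = (rng.random((500, 50)) < lambdas * 0.2).astype(float)  # synthetic losses
print(learn_then_test(losses, lambdas, alpha=0.1, delta=0.1))
```

Any λ returned by this procedure carries the guarantee that its true risk is at most α with probability at least 1 − δ over the calibration data; the paper's sequential graphical testing variants follow the same template but spend the error budget δ more efficiently than Bonferroni.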