Charles Marx
June 2025

© 2025 by Charles Thomas Marx. All Rights Reserved.
Re-distributed by Stanford University under license with the author.

This dissertation is online at: https://purl.stanford.edu/sm978sh0523

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Stefano Ermon, Primary Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Sanmi Koyejo

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Volodymyr Kuleshov

Approved for the Stanford University Committee on Graduate Studies.

Stacey F. Bent, Vice Provost for Graduate Education

Abstract

Reliable uncertainty quantification is fundamental to the safe and effective deployment of machine learning systems in high-stakes settings, where predictions inform decisions ranging from medical diagnoses to infrastructure management and scientific discovery. This dissertation presents a study of probabilistic prediction with a focus on making uncertainty estimates trustworthy by achieving calibration, wherein predicted probabilities align with the empirical frequencies of events, such as a 90% confidence interval including the observed outcome 90% of the time. I propose interventions to improve calibration across the model lifecycle: training objectives to encourage calibration, post-processing methods to correct miscalibration, and online techniques for adaptively preserving calibration during deployment in nonstationary environments.

The first part addresses post-hoc recalibration. I introduce modular conformal calibration, a general framework that encompasses and extends existing post-hoc uncertainty quantification techniques such as isotonic regression and conformal prediction. This framework identifies a design space for recalibration procedures and provides finite-sample calibration guarantees for any model recalibrated using these strategies. This allows practitioners to trade off between computational cost, meaningful likelihoods, deterministic behavior, and stronger calibration guarantees.

In the second part, I turn to training-time calibration with the goal of encouraging calibration while maintaining sharpness (the degree to which predictions are confident and informative). I propose a class of differentiable calibration measures that serve as regularization objectives, enabling co-optimization of calibration and sharpness during training. These objectives encompass many popular notions of calibration for regression and classification previously enforced only after training, instead incorporating them into standard empirical risk minimization. They also enable task-specific calibration objectives, allowing probabilistic models whose uncertainty estimates are both statistically coherent and aligned with the practical needs of downstream decision-making.
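To make this training-time strategy concrete, the following is a minimal sketch of a differentiable calibration regularizer for Gaussian regression, written in PyTorch. It illustrates the spirit of the objectives above rather than the dissertation's exact formulation: the function names, the grid of quantile levels, the sigmoid temperature tau, and the weight lam are all illustrative choices.

import torch

def soft_quantile_calibration_penalty(mu, sigma, y, levels, tau=0.05):
    # Differentiable proxy for quantile calibration error: for each nominal
    # level p, the (softly counted) fraction of targets falling below the
    # predicted p-quantile should equal p for a calibrated model.
    normal = torch.distributions.Normal(mu, sigma)
    penalty = 0.0
    for p in levels:
        q = normal.icdf(torch.full_like(mu, p))    # per-example p-quantile
        soft_cover = torch.sigmoid((q - y) / tau)  # smooth surrogate for 1{y <= q}
        penalty = penalty + (soft_cover.mean() - p) ** 2
    return penalty / len(levels)

def calibration_regularized_loss(mu, sigma, y, lam=1.0):
    # Negative log-likelihood encourages sharpness; the penalty encourages
    # calibration. lam controls the trade-off between the two.
    nll = -torch.distributions.Normal(mu, sigma).log_prob(y).mean()
    levels = (0.1, 0.25, 0.5, 0.75, 0.9)
    return nll + lam * soft_quantile_calibration_penalty(mu, sigma, y, levels)

Here lam trades off sharpness (the likelihood term) against calibration, and as tau shrinks the sigmoid surrogate approaches the hard coverage indicator, at the cost of less informative gradients.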
The third part investigates calibration under distribution shift, a central challenge in real-world deployments. I consider an online forecasting setting where data may evolve over time or be selected adversarially. Building on Blackwell approachability theory, I develop a general strategy for enforcing calibration guarantees across arbitrary observation sequences under minimal assumptions. This framework supports diverse calibration notions, including distribution and decision calibration, through both oracle-based and computationally tractable algorithms. I further present gradient-based approaches that relax the guarantees while enabling broader applicability. Empirical evaluations demonstrate that these methods maintain calibrated forecasts while achieving vanishing regret with respect to expert predictors.

Collectively, this dissertation provides principled strategies for uncertainty estimation with increased flexibility by enforcing many forms of calibration at each stage of model development. This comprehensive approach to calibration across the model lifecycle enables practitioners to tailor uncertainty quantification to their specific applications while reliably informing decisions in high-stakes settings.

Acknowledgments

First and foremost, I want to thank my advisor, Stefano Ermon. In addition to being one of the smartest people I’ve had the good fortune to work with, Stefano has given me more freedom to work on anything that draws my interest than I could have reasonably hoped for. Volodymyr Kuleshov and Berk Ustun have also been great mentors throughout my PhD, being incredibly generous with their time and ideas. I am grateful to Emma Brunskill, Tatsu Hashimoto, Sanmi Koyejo, Vasilis Syrgkanis, and Omer Reingold for their support at various stages of my PhD, including serving on my committees and hosting thought-provoking research rotations. I would also like to thank t