Towards Optimal Problem Dependent Generalization Error Bounds in Statistical Learning Theory

Author

Listed:
  • Yunbei Xu

    (Decision, Risk, and Operations Division, Graduate School of Business, Columbia University, New York, New York 10027)

  • Assaf Zeevi

    (Decision, Risk, and Operations Division, Graduate School of Business, Columbia University, New York, New York 10027)

Abstract

We study problem-dependent rates, that is, generalization errors that scale near-optimally with the variance, effective loss, or gradient norms evaluated at the “best hypothesis.” We introduce a principled framework dubbed “uniform localized convergence” and characterize sharp problem-dependent rates for central statistical learning problems. From a methodological viewpoint, our framework resolves several fundamental limitations of existing uniform convergence and localization analysis approaches. It also provides improvements and some level of unification in the study of localized complexities, one-sided uniform inequalities, and sample-based iterative algorithms. In the so-called “slow rate” regime, we provide the first (moment-penalized) estimator that achieves the optimal variance-dependent rate for general “rich” classes; we also establish an improved loss-dependent rate for standard empirical risk minimization. In the “fast rate” regime, we establish finite-sample, problem-dependent bounds that are comparable to precise asymptotics. In addition, we show that iterative algorithms such as gradient descent and first order expectation maximization can achieve optimal generalization error in several representative problems across the areas of nonconvex learning, stochastic optimization, and learning with missing data.

Suggested Citation

  • Yunbei Xu & Assaf Zeevi, 2025. "Towards Optimal Problem Dependent Generalization Error Bounds in Statistical Learning Theory," Mathematics of Operations Research, INFORMS, vol. 50(1), pages 40-67, February.
  • Handle: RePEc:inm:ormoor:v:50:y:2025:i:1:p:40-67
    DOI: 10.1287/moor.2021.0076

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/moor.2021.0076
    Download Restriction: no

    References listed on IDEAS

    1. Athey, Susan & Wager, Stefan, 2017. "Efficient Policy Learning," Research Papers 3506, Stanford University, Graduate School of Business.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xinkun Nie & Stefan Wager, 2017. "Quasi-Oracle Estimation of Heterogeneous Treatment Effects," Papers 1712.04912, arXiv.org, revised Aug 2020.
    2. Michael C. Knaus, 2021. "A double machine learning approach to estimate the effects of musical practice on student’s skills," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 282-300, January.
    3. Zhengyuan Zhou & Susan Athey & Stefan Wager, 2023. "Offline Multi-Action Policy Learning: Generalization and Optimization," Operations Research, INFORMS, vol. 71(1), pages 148-183, January.
    4. Anders Bredahl Kock & Martin Thyrsgaard, 2017. "Optimal sequential treatment allocation," Papers 1705.09952, arXiv.org, revised Aug 2018.
    5. Susan Athey & Raj Chetty & Guido Imbens, 2020. "Using Experiments to Correct for Selection in Observational Studies," Papers 2006.09676, arXiv.org, revised May 2025.
    6. Carlos Fernández-Loría & Foster Provost & Jesse Anderton & Benjamin Carterette & Praveen Chandar, 2020. "A Comparison of Methods for Treatment Assignment with an Application to Playlist Generation," Papers 2004.11532, arXiv.org, revised Apr 2022.
    7. Crystal T. Nguyen & Daniel J. Luckett & Anna R. Kahkoska & Grace E. Shearrer & Donna Spruijt‐Metz & Jaimie N. Davis & Michael R. Kosorok, 2020. "Estimating individualized treatment regimes from crossover designs," Biometrics, The International Biometric Society, vol. 76(3), pages 778-788, September.
    8. Kock, Anders Bredahl & Preinerstorfer, David & Veliyev, Bezirgen, 2023. "Treatment recommendation with distributional targets," Journal of Econometrics, Elsevier, vol. 234(2), pages 624-646.
    9. Masahiro Kato & Shota Yasui & Kenichiro McAlinn, 2020. "The Adaptive Doubly Robust Estimator for Policy Evaluation in Adaptive Experiments and a Paradox Concerning Logging Policy," Papers 2010.03792, arXiv.org, revised Jun 2021.
    10. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    11. Daniel Jacob, 2021. "CATE meets ML -- The Conditional Average Treatment Effect and Machine Learning," Papers 2104.09935, arXiv.org, revised Apr 2021.
    12. Haupt, Johannes & Jacob, Daniel & Gubela, Robin M. & Lessmann, Stefan, 2019. "Affordable Uplift: Supervised Randomization in Controlled Experiments," IRTG 1792 Discussion Papers 2019-026, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
    13. Maria Dimakopoulou & Zhimei Ren & Zhengyuan Zhou, 2021. "Online Multi-Armed Bandits with Adaptive Inference," Papers 2102.13202, arXiv.org, revised Jun 2021.
    14. Jann Spiess & Vasilis Syrgkanis & Victor Yaneng Wang, 2021. "Finding Subgroups with Significant Treatment Effects," Papers 2103.07066, arXiv.org, revised Dec 2023.
    15. Daniel Jacob, 2021. "CATE meets ML," Digital Finance, Springer, vol. 3(2), pages 99-148, June.
    16. Nian Si & Fan Zhang & Zhengyuan Zhou & Jose Blanchet, 2023. "Distributionally Robust Batch Contextual Bandits," Management Science, INFORMS, vol. 69(10), pages 5772-5793, October.
    17. Nathan Kallus, 2022. "Treatment Effect Risk: Bounds and Inference," Papers 2201.05893, arXiv.org, revised Jul 2022.
    18. Marianne Bertrand & Bruno Crépon & Alicia Marguerie & Patrick Premand, 2021. "Do Workfare Programs Live Up to Their Promises? Experimental Evidence from Cote D’Ivoire," NBER Working Papers 28664, National Bureau of Economic Research, Inc.
    19. Caio Waisman & Harikesh S. Nair & Carlos Carrion, 2025. "Online Causal Inference for Advertising in Real-Time Bidding Auctions," Marketing Science, INFORMS, vol. 44(1), pages 176-195, January.
    20. Andrew Bennett & Nathan Kallus, 2020. "Efficient Policy Learning from Surrogate-Loss Classification Reductions," Papers 2002.05153, arXiv.org.
