IDEAS home Printed from https://ideas.repec.org/a/nat/natcom/v16y2025i1d10.1038_s41467-025-64658-7.html

Selecting fitted models under epistemic uncertainty using a stochastic process on quantile functions

Author

Listed:
  • Alexandre René

    (Physik
    Chair of Computational Network Science
    Department of Physics)

  • André Longtin

    (Department of Physics
    Department of Cellular and Molecular Medicine
    Center for Neural Dynamics)

Abstract

Fitting models to data is an important part of the practice of science. Advances in machine learning have made it possible to fit more—and more complex—models, but have also exacerbated a problem: when multiple models fit the data equally well, which one(s) should we pick? The answer depends entirely on the modelling goal. In the scientific context, the essential goal is replicability: if a model works well to describe one experiment, it should continue to do so when that experiment is replicated tomorrow, or in another laboratory. The selection criterion must therefore be robust to the variations inherent to the replication process. In this work we develop a nonparametric method for estimating uncertainty on a model’s empirical risk when replications are non-stationary, thus ensuring that a model is only rejected when another is reproducibly better. We illustrate the method with two examples: one a more classical setting, where the models are structurally distinct, and a machine learning-inspired setting, where they differ only in the value of their parameters. We show how, in this context of replicability or “epistemic uncertainty”, it compares favourably to existing model selection criteria, and has more satisfactory behaviour with large experimental datasets.

Suggested Citation

  • Alexandre René & André Longtin, 2025. "Selecting fitted models under epistemic uncertainty using a stochastic process on quantile functions," Nature Communications, Nature, vol. 16(1), pages 1-25, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-64658-7
    DOI: 10.1038/s41467-025-64658-7

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-64658-7
    File Function: Abstract
    Download Restriction: no


    References listed on IDEAS

    1. Sébastien Van Bellegem & Rainer Dahlhaus, 2006. "Semiparametric estimation by model selection for locally stationary processes," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 68(5), pages 721-746, November.
    2. Jinchi Lv & Jun S. Liu, 2014. "Model selection principles in misspecified models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(1), pages 141-167, January.
    3. David Findley, 1991. "Counterexamples to parsimony and BIC," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 43(3), pages 505-514, September.
    4. David J. Spiegelhalter & Nicola G. Best & Bradley P. Carlin & Angelika Van Der Linde, 2002. "Bayesian measures of model complexity and fit," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 64(4), pages 583-639, October.
    5. R. Golden, 2003. "Discrepancy Risk Model Selection Test theory for comparing possibly misspecified or nonnested models," Psychometrika, Springer;The Psychometric Society, vol. 68(2), pages 229-249, June.
    6. Gneiting, Tilmann & Raftery, Adrian E., 2007. "Strictly Proper Scoring Rules, Prediction, and Estimation," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 359-378, March.
    7. David J. Spiegelhalter & Nicola G. Best & Bradley P. Carlin & Angelika Van Der Linde, 2014. "The deviance information criterion: 12 years on," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(3), pages 485-493, June.
    8. Marc C. Kennedy & Anthony O'Hagan, 2001. "Bayesian calibration of computer models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 63(3), pages 425-464.
    9. De Schuymer, B. & De Meyer, H. & De Baets, B., 2005. "Cycle-transitive comparison of independent random variables," Journal of Multivariate Analysis, Elsevier, vol. 96(2), pages 352-373, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fabian Krüger & Sebastian Lerch & Thordis Thorarinsdottir & Tilmann Gneiting, 2021. "Predictive Inference Based on Markov Chain Monte Carlo Output," International Statistical Review, International Statistical Institute, vol. 89(2), pages 274-301, August.
    2. Rubio, F.J. & Steel, M.F.J., 2011. "Inference for grouped data with a truncated skew-Laplace distribution," Computational Statistics & Data Analysis, Elsevier, vol. 55(12), pages 3218-3231, December.
    3. Kai Yang & Qingqing Zhang & Xinyang Yu & Xiaogang Dong, 2023. "Bayesian inference for a mixture double autoregressive model," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 77(2), pages 188-207, May.
    4. Bassetti, Federico & De Giuli, Maria Elena & Nicolino, Enrica & Tarantola, Claudia, 2018. "Multivariate dependence analysis via tree copula models: An application to one-year forward energy contracts," European Journal of Operational Research, Elsevier, vol. 269(3), pages 1107-1121.
    5. Papastamoulis, Panagiotis, 2018. "Overfitting Bayesian mixtures of factor analyzers with an unknown number of components," Computational Statistics & Data Analysis, Elsevier, vol. 124(C), pages 220-234.
    6. Constandina Koki & Loukia Meligkotsidou & Ioannis Vrontos, 2020. "Forecasting under model uncertainty: Non‐homogeneous hidden Markov models with Pòlya‐Gamma data augmentation," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 39(4), pages 580-598, July.
    7. Mark F. J. Steel, 2020. "Model Averaging and Its Use in Economics," Journal of Economic Literature, American Economic Association, vol. 58(3), pages 644-719, September.
    8. repec:plo:pone00:0150180 is not listed on IDEAS
    9. David Kaplan & Chansoon Lee, 2018. "Optimizing Prediction Using Bayesian Model Averaging: Examples Using Large-Scale Educational Assessments," Evaluation Review, vol. 42(4), pages 423-457, August.
    10. White, Staci A. & Herbei, Radu, 2015. "A Monte Carlo approach to quantifying model error in Bayesian parameter estimation," Computational Statistics & Data Analysis, Elsevier, vol. 83(C), pages 168-181.
    11. Muhammed Semakula & François Niragire & Christel Faes, 2020. "Bayesian spatio-temporal modeling of malaria risk in Rwanda," PLOS ONE, Public Library of Science, vol. 15(9), pages 1-16, September.
    12. Angelo Moretti, 2023. "Estimation of small area proportions under a bivariate logistic mixed model," Quality & Quantity: International Journal of Methodology, Springer, vol. 57(4), pages 3663-3684, August.
    13. Carlos Díaz-Avalos & Pablo Juan & Somnath Chaudhuri & Marc Sáez & Laura Serra, 2020. "Association between the New COVID-19 Cases and Air Pollution with Meteorological Elements in Nine Counties of New York State," IJERPH, MDPI, vol. 17(23), pages 1-18, December.
    14. Yang, Kai & Yu, Xinyang & Zhang, Qingqing & Dong, Xiaogang, 2022. "On MCMC sampling in self-exciting integer-valued threshold time series models," Computational Statistics & Data Analysis, Elsevier, vol. 169(C).
    15. Yaojun Zhang & Lanpeng Ji & Georgios Aivaliotis & Charles Taylor, 2023. "Bayesian CART models for insurance claims frequency," Papers 2303.01923, arXiv.org, revised Dec 2023.
    16. Smith, Michael Stanley, 2015. "Copula modelling of dependence in multivariate time series," International Journal of Forecasting, Elsevier, vol. 31(3), pages 815-833.
    17. Pedro Saramago & Karl Claxton & Nicky J. Welton & Marta Soares, 2020. "Bayesian econometric modelling of observational data for cost‐effectiveness analysis: establishing the value of negative pressure wound therapy in the healing of open surgical wounds," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(4), pages 1575-1593, October.
    18. Oludare Ariyo & Emmanuel Lesaffre & Geert Verbeke & Adrian Quintero, 2022. "Bayesian Model Selection for Longitudinal Count Data," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 84(2), pages 516-547, November.
    19. Keunseo Kim & Hyojoong Kim & Vinnam Kim & Heeyoung Kim, 2020. "A Multiscale Spatially Varying Coefficient Model for Regional Analysis of Topsoil Geochemistry," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 25(1), pages 74-89, March.
    20. Chen, Yewen & Chang, Xiaohui & Luo, Fangzhi & Huang, Hui, 2023. "Additive dynamic models for correcting numerical model outputs," Computational Statistics & Data Analysis, Elsevier, vol. 187(C).
    21. Rodrigues, E.C. & Assunção, R., 2012. "Bayesian spatial models with a mixture neighborhood structure," Journal of Multivariate Analysis, Elsevier, vol. 109(C), pages 88-102.
