Printed from https://ideas.repec.org/p/arx/papers/2512.00916.html

An Imbalance-Robust Evaluation Framework for Extreme Risk Forecasts

Author

  • Sotirios D. Nikolopoulos

Abstract

Evaluating rare-event forecasts is challenging because standard metrics collapse as event prevalence declines. Measures such as F1-score, AUPRC, MCC, and accuracy induce degenerate thresholds (converging to zero or one), and their values become dominated by class imbalance rather than tail discrimination. We develop a family of rare-event-stable (RES) metrics whose optimal thresholds remain strictly interior as the event probability approaches zero, ensuring coherent decision rules under extreme rarity. Simulations spanning event probabilities from 0.01 down to one in a million show that RES metrics maintain stable thresholds, consistent model rankings, and near-complete prevalence invariance, whereas traditional metrics exhibit statistically significant threshold drift and structural collapse. A credit-default application confirms these results: RES metrics yield interpretable probability-of-default cutoffs (4-9%) and remain robust under subsampling, while classical metrics fail operationally. The RES framework provides a principled, prevalence-invariant basis for evaluating extreme-risk forecasts.
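
The threshold drift described in the abstract can be illustrated with a small population-level sketch. It is not the paper's simulation design and does not implement the RES metrics, whose definitions are not given on this page. It assumes a hypothetical equal-variance binormal score model (non-events scored as N(0, 1), events as N(mu, 1), with mu = 2 chosen arbitrarily) and contrasts the F1-optimal cutoff with the cutoff that maximizes Youden's J (TPR minus FPR), a classical prevalence-free statistic used here only as a stand-in for "a metric whose optimum does not move with prevalence". All function and variable names are illustrative.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def rates(t, mu):
    """Population TPR and FPR when 'score > t' is flagged as an event."""
    tpr = 1.0 - norm_cdf(t - mu)  # events score as N(mu, 1) (assumption)
    fpr = 1.0 - norm_cdf(t)       # non-events score as N(0, 1) (assumption)
    return tpr, fpr

def f1(t, prevalence, mu):
    """Population F1 = 2TP / (2TP + FP + FN), written in terms of rates."""
    tpr, fpr = rates(t, mu)
    denom = prevalence * (1.0 + tpr) + (1.0 - prevalence) * fpr
    return 2.0 * prevalence * tpr / denom

def youden_j(t, prevalence, mu):
    """Youden's J = TPR - FPR; prevalence is accepted but never used."""
    tpr, fpr = rates(t, mu)
    return tpr - fpr

def posterior(t, prevalence, mu):
    """Event probability implied by a score exactly at the cutoff t."""
    likelihood_ratio = math.exp(mu * t - 0.5 * mu * mu)
    return prevalence * likelihood_ratio / (
        prevalence * likelihood_ratio + 1.0 - prevalence)

def best_threshold(metric, prevalence, mu=2.0):
    """Grid search for the score cutoff that maximizes the metric."""
    grid = [i / 100.0 for i in range(-300, 801)]
    return max(grid, key=lambda t: metric(t, prevalence, mu))

MU = 2.0
PREVALENCES = [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]  # range quoted in the abstract

f1_cutoffs = [best_threshold(f1, p, MU) for p in PREVALENCES]
j_cutoffs = [best_threshold(youden_j, p, MU) for p in PREVALENCES]
f1_probs = [posterior(t, p, MU) for t, p in zip(f1_cutoffs, PREVALENCES)]

for p, tf, pf, tj in zip(PREVALENCES, f1_cutoffs, f1_probs, j_cutoffs):
    print(f"prevalence={p:.0e}  F1 cutoff={tf:.2f} "
          f"(implied probability {pf:.5f})  J cutoff={tj:.2f}")
```

Under these assumptions the F1-optimal score cutoff moves further into the tail each time prevalence falls, and the event probability implied at that cutoff shrinks toward zero, while the J-optimal cutoff stays at mu / 2. This mirrors the contrast the abstract draws between drifting and interior thresholds, but it says nothing about how the RES family itself is constructed.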

Suggested Citation

  • Sotirios D. Nikolopoulos, 2025. "An Imbalance-Robust Evaluation Framework for Extreme Risk Forecasts," Papers 2512.00916, arXiv.org.
  • Handle: RePEc:arx:papers:2512.00916

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2512.00916
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lahiri, Kajal & Yang, Liu, 2013. "Forecasting Binary Outcomes," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 2, chapter 0, pages 1025-1106, Elsevier.
    2. Allaj, Erindi & Sanfelici, Simona, 2023. "Early Warning Systems for identifying financial instability," International Journal of Forecasting, Elsevier, vol. 39(4), pages 1777-1803.
    3. Antunes, António & Bonfim, Diana & Monteiro, Nuno & Rodrigues, Paulo M.M., 2018. "Forecasting banking crises with dynamic panel probit models," International Journal of Forecasting, Elsevier, vol. 34(2), pages 249-275.
    4. Geoff Kenny & Thomas Kostka & Federico Masera, 2015. "Density characteristics and density forecast performance: a panel analysis," Empirical Economics, Springer, vol. 48(3), pages 1203-1231, May.
    5. Arijit Paladhi, 2025. "Predicting news deserts using supervised machine learning," Journal of Computational Social Science, Springer, vol. 8(2), pages 1-29, May.
    6. Luisa Bisaglia & Matteo Grigoletto, 2021. "A new time-varying model for forecasting long-memory series," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(1), pages 139-155, March.
    7. Kapetanios, G. & Mitchell, J. & Price, S. & Fawcett, N., 2015. "Generalised density forecast combinations," Journal of Econometrics, Elsevier, vol. 188(1), pages 150-165.
    8. Yuru Sun & Worapree Maneesoonthorn & Ruben Loaiza-Maya & Gael M. Martin, 2023. "Optimal probabilistic forecasts for risk management," Papers 2303.01651, arXiv.org.
    9. Ruben Loaiza‐Maya & Gael M. Martin & David T. Frazier, 2021. "Focused Bayesian prediction," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 36(5), pages 517-543, August.
    10. Wang, Xiaoqian & Hyndman, Rob J. & Li, Feng & Kang, Yanfei, 2023. "Forecast combinations: An over 50-year review," International Journal of Forecasting, Elsevier, vol. 39(4), pages 1518-1547.
    11. Elena Andreou & Andros Kourtellos, 2018. "Scoring rules for simple forecasting models: The case of Cyprus GDP and its sectors," Cyprus Economic Policy Review, University of Cyprus, Economics Research Centre, vol. 12(1), pages 59-73, June.
    12. Martin, Gael M. & Loaiza-Maya, Rubén & Maneesoonthorn, Worapree & Frazier, David T. & Ramírez-Hassan, Andrés, 2022. "Optimal probabilistic forecasts: When do they work?," International Journal of Forecasting, Elsevier, vol. 38(1), pages 384-406.
    13. Jean-Baptiste Hasse, 2022. "Systemic risk: a network approach," Empirical Economics, Springer, vol. 63(1), pages 313-344, July.
    14. Weerasinghe, Chaya & Loaiza-Maya, Rubén & Martin, Gael M. & Frazier, David T., 2025. "ABC-based forecasting in misspecified state space models," International Journal of Forecasting, Elsevier, vol. 41(1), pages 270-289.
    15. Taillardat, Maxime & Fougères, Anne-Laure & Naveau, Philippe & de Fondeville, Raphaël, 2023. "Evaluating probabilistic forecasts of extremes using continuous ranked probability score distributions," International Journal of Forecasting, Elsevier, vol. 39(3), pages 1448-1459.
    16. David T. Frazier & Ruben Loaiza-Maya & Gael M. Martin, 2021. "Variational Bayes in State Space Models: Inferential and Predictive Accuracy," Papers 2106.12262, arXiv.org, revised Feb 2022.
    17. Ramon de Punder & Timo Dimitriadis & Rutger-Jan Lange, 2024. "Kullback-Leibler-based characterizations of score-driven updates," Tinbergen Institute Discussion Papers 24-051/III, Tinbergen Institute, revised 22 Oct 2024.
    18. Jie Cheng, 2024. "Evaluating Density Forecasts Using Weighted Multivariate Scores in a Risk Management Context," Computational Economics, Springer;Society for Computational Economics, vol. 64(6), pages 3617-3643, December.
    19. Kuangyu Wen & Wenbin Wu & Ximing Wu, 2023. "Electricity demand forecasting and risk management using Gaussian process model with error propagation," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 42(4), pages 957-969, July.
    20. Peter McAdam & Anders Warne, 2024. "Density forecast combinations: The real‐time dimension," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 43(5), pages 1153-1172, August.
