Printed from https://ideas.repec.org/a/plo/pcbi00/1011393.html

Scoring epidemiological forecasts on transformed scales

Author

Listed:
  • Nikos I Bosse
  • Sam Abbott
  • Anne Cori
  • Edwin van Leeuwen
  • Johannes Bracher
  • Sebastian Funk

Abstract

Forecast evaluation is essential for the development of predictive epidemic models and can inform their use in public health decision-making. Common scores for evaluating epidemiological forecasts are the Continuous Ranked Probability Score (CRPS) and the Weighted Interval Score (WIS), which can be seen as measures of the absolute distance between the forecast distribution and the observation. However, applying these scores directly to predicted and observed incidence counts may not be ideal, due to the exponential nature of epidemic processes and the varying magnitudes of observed values across space and time. In this paper, we argue that transforming counts before applying scores such as the CRPS or WIS can effectively mitigate these difficulties and yield epidemiologically meaningful and easily interpretable results. Using the CRPS on log-transformed values as an example, we highlight three attractive properties: firstly, it can be interpreted as a probabilistic version of a relative error. Secondly, it reflects how well models predicted the time-varying epidemic growth rate. Lastly, drawing on arguments about variance-stabilising transformations, it can be shown that under the assumption of a quadratic mean-variance relationship, the logarithmic transformation leads to expected CRPS values that are independent of the order of magnitude of the predicted quantity. Applying a log(x + 1) transformation to data and forecasts from the European COVID-19 Forecast Hub, we find that it changes model rankings regardless of stratification by forecast date, location or target type. Situations in which models missed the beginning of upward swings are emphasised more strongly, while failing to predict a downturn following a peak is penalised less severely, when scoring transformed forecasts as opposed to untransformed ones. We conclude that appropriate transformations, of which the natural logarithm is only one particularly attractive option, should be considered when assessing the performance of different models in the context of infectious disease incidence.

Author summary

Scores like the Continuous Ranked Probability Score (CRPS) or the Weighted Interval Score (WIS) are commonly used to evaluate epidemiological forecasts and are measures of the absolute distance between forecast and observation. Due to the exponential nature of epidemic processes, evaluating this absolute distance may not be ideal. We argue that transforming counts before applying the CRPS or WIS can yield more meaningful results. The natural logarithm is a particularly attractive transformation in epidemiological settings. Scores computed on log-transformed values can be interpreted as a probabilistic version of a relative error and reflect how well forecasters predict the time-varying epidemic growth rate. If the data-generating process has a quadratic mean-variance relationship, the logarithmic transformation also leads to expected CRPS values that are independent of the order of magnitude of the predicted quantity. We illustrate these properties using data from the European COVID-19 Forecast Hub and find that scoring transformed counts changes model rankings. Stronger emphasis is given to situations in which forecasters missed the beginning of upward swings, while failing to predict a downturn following a peak is penalised less severely. We generally recommend including evaluations of transformed counts when assessing forecaster performance.
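The scale-invariance property described above can be illustrated with a small sketch. The code below uses simulated, hypothetical forecast samples (not Hub data) and a standard sample-based CRPS estimate, CRPS ≈ E|X − y| − ½ E|X − X′|; the function name `crps_samples` and all parameter values are illustrative assumptions, not part of the paper.

```python
import numpy as np

def crps_samples(samples, y):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|."""
    samples = np.asarray(samples, dtype=float)
    return (np.mean(np.abs(samples - y))
            - 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :])))

rng = np.random.default_rng(1)

# Hypothetical predictive samples for a weekly incidence count of 1000.
obs = 1000.0
forecast = rng.lognormal(mean=np.log(obs), sigma=0.3, size=2000)

# Score the same forecast on the natural and on the log(x + 1) scale.
crps_natural = crps_samples(forecast, obs)
crps_log = crps_samples(np.log(forecast + 1), np.log(obs + 1))

# Rescale the epidemic tenfold: the natural-scale CRPS grows tenfold,
# while the log-scale CRPS stays (almost) unchanged.
crps_natural_10x = crps_samples(forecast * 10, obs * 10)
crps_log_10x = crps_samples(np.log(forecast * 10 + 1),
                            np.log(obs * 10 + 1))
```

Here `crps_natural_10x` is (up to floating-point error) exactly ten times `crps_natural`, whereas `crps_log_10x` is nearly identical to `crps_log`, which is the relative-error reading of log-scale scores the abstract describes.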

Suggested Citation

  • Nikos I Bosse & Sam Abbott & Anne Cori & Edwin van Leeuwen & Johannes Bracher & Sebastian Funk, 2023. "Scoring epidemiological forecasts on transformed scales," PLOS Computational Biology, Public Library of Science, vol. 19(8), pages 1-23, August.
  • Handle: RePEc:plo:pcbi00:1011393
    DOI: 10.1371/journal.pcbi.1011393
    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011393
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011393&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1011393?utm_source=ideas
    LibKey link: if access is restricted and your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Johannes Bracher & Evan L Ray & Tilmann Gneiting & Nicholas G Reich, 2021. "Evaluating epidemic forecasts in an interval format," PLOS Computational Biology, Public Library of Science, vol. 17(2), pages 1-15, February.
    2. R. Winkler & Javier Muñoz & José Cervera & José Bernardo & Gail Blattenberger & Joseph Kadane & Dennis Lindley & Allan Murphy & Robert Oliver & David Ríos-Insua, 1996. "Scoring rules and the evaluation of probabilities," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 5(1), pages 1-60, June.
    3. Gneiting, Tilmann & Raftery, Adrian E., 2007. "Strictly Proper Scoring Rules, Prediction, and Estimation," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 359-378, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Paolo Giudici & Emanuela Raffinetti, 2025. "RGA: a unified measure of predictive accuracy," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 19(1), pages 67-93, March.
    2. Victor Richmond R. Jose & Robert F. Nau & Robert L. Winkler, 2008. "Scoring Rules, Generalized Entropy, and Utility Maximization," Operations Research, INFORMS, vol. 56(5), pages 1146-1157, October.
    3. David Kaplan & Chansoon Lee, 2018. "Optimizing Prediction Using Bayesian Model Averaging: Examples Using Large-Scale Educational Assessments," Evaluation Review, , vol. 42(4), pages 423-457, August.
    4. Makridakis, Spyros & Spiliotis, Evangelos & Assimakopoulos, Vassilios & Chen, Zhi & Gaba, Anil & Tsetlin, Ilia & Winkler, Robert L., 2022. "The M5 uncertainty competition: Results, findings and conclusions," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1365-1385.
    5. Wang, Xiaoqian & Hyndman, Rob J. & Li, Feng & Kang, Yanfei, 2023. "Forecast combinations: An over 50-year review," International Journal of Forecasting, Elsevier, vol. 39(4), pages 1518-1547.
    6. Fabian Kruger & Hendrik Plett, 2022. "Prediction intervals for economic fixed-event forecasts," Papers 2210.13562, arXiv.org, revised Mar 2024.
    7. Yael Grushka-Cockayne & Kenneth C. Lichtendahl Jr. & Victor Richmond R. Jose & Robert L. Winkler, 2017. "Quantile Evaluation, Sensitivity to Bracketing, and Sharing Business Payoffs," Operations Research, INFORMS, vol. 65(3), pages 712-728, June.
    8. Braun, Julia & Sabanés Bové, Daniel & Held, Leonhard, 2014. "Choice of generalized linear mixed models using predictive crossvalidation," Computational Statistics & Data Analysis, Elsevier, vol. 75(C), pages 190-202.
    9. Taillardat, Maxime & Fougères, Anne-Laure & Naveau, Philippe & de Fondeville, Raphaël, 2023. "Evaluating probabilistic forecasts of extremes using continuous ranked probability score distributions," International Journal of Forecasting, Elsevier, vol. 39(3), pages 1448-1459.
    10. Karl Schlag & James Tremewan & Joël Weele, 2015. "A penny for your thoughts: a survey of methods for eliciting beliefs," Experimental Economics, Springer;Economic Science Association, vol. 18(3), pages 457-490, September.
    11. Luis A. Barboza & Shu Wei Chou Chen & Marcela Alfaro Córdoba & Eric J. Alfaro & Hugo G. Hidalgo, 2023. "Spatio‐temporal downscaling emulator for regional climate models," Environmetrics, John Wiley & Sons, Ltd., vol. 34(7), November.
    12. Kathryn S Taylor & James W Taylor, 2022. "Interval forecasts of weekly incident and cumulative COVID-19 mortality in the United States: A comparison of combining methods," PLOS ONE, Public Library of Science, vol. 17(3), pages 1-25, March.
    13. Ray, Evan L. & Brooks, Logan C. & Bien, Jacob & Biggerstaff, Matthew & Bosse, Nikos I. & Bracher, Johannes & Cramer, Estee Y. & Funk, Sebastian & Gerding, Aaron & Johansson, Michael A. & Rumack, Aaron, 2023. "Comparing trained and untrained probabilistic ensemble forecasts of COVID-19 cases and deaths in the United States," International Journal of Forecasting, Elsevier, vol. 39(3), pages 1366-1383.
    14. Gneiting, Tilmann, 2011. "Making and Evaluating Point Forecasts," Journal of the American Statistical Association, American Statistical Association, vol. 106(494), pages 746-762.
    15. Andrew Grant & David Johnstone & Oh Kang Kwon, 2019. "A Probability Scoring Rule for Simultaneous Events," Decision Analysis, INFORMS, vol. 16(4), pages 301-313, December.
    16. Gneiting, Tilmann, 2011. "Quantiles as optimal point forecasts," International Journal of Forecasting, Elsevier, vol. 27(2), pages 197-207, April.
    17. Aitazaz Ali Raja & Pierre Pinson & Jalal Kazempour & Sergio Grammatico, 2022. "A Market for Trading Forecasts: A Wagering Mechanism," Papers 2205.02668, arXiv.org, revised Oct 2022.
    18. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, arXiv.org, revised Jan 2022.
    19. Alexander Henzi & Johanna F Ziegel, 2022. "Valid sequential inference on probability forecast performance [A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems]," Biometrika, Biometrika Trust, vol. 109(3), pages 647-663.
    20. Makariou, Despoina & Barrieu, Pauline & Tzougas, George, 2021. "A finite mixture modelling perspective for combining experts’ opinions with an application to quantile-based risk measures," LSE Research Online Documents on Economics 110763, London School of Economics and Political Science, LSE Library.



    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.