
Using information theory to optimise epidemic models for real-time prediction and estimation

Author

Listed:
  • Kris V Parag
  • Christl A Donnelly

Abstract

The effective reproduction number, Rt, is a key time-varying prognostic for the growth rate of any infectious disease epidemic. Significant changes in Rt can forewarn about new transmissions within a population or predict the efficacy of interventions. Inferring Rt reliably and in real-time from observed time-series of infected (demographic) data is an important problem in population dynamics. The renewal or branching process model is a popular solution that has been applied to Ebola and Zika virus disease outbreaks, among others, and is currently being used to investigate the ongoing COVID-19 pandemic. This model estimates Rt using a heuristically chosen piecewise function. While this facilitates real-time detection of statistically significant Rt changes, inference is highly sensitive to the function choice. Improperly chosen piecewise models might ignore meaningful changes or over-interpret noise-induced ones, yet produce visually reasonable estimates. No principled piecewise selection scheme exists. We develop a practical yet rigorous scheme using the accumulated prediction error (APE) metric from information theory, which deems the model capable of describing the observed data using the fewest bits as most justified. We derive exact posterior prediction distributions for infected population size and integrate these within an APE framework to obtain an exact and reliable method for identifying the piecewise function best supported by available epidemic data. We find that this choice optimises short-term prediction accuracy and can rapidly detect salient fluctuations in Rt, and hence the infected population growth rate, in real-time over the course of an unfolding epidemic. Moreover, we emphasise the need for formal selection by exposing how common heuristic choices, which seem sensible, can be misleading. Our APE-based method is easily computed and broadly applicable to statistically similar models found in phylogenetics and macroevolution, for example. 
Our results explore the relationships among estimate precision, forecast reliability and model complexity.

Author summary: Understanding how the population of infected individuals (which may be humans, animals or plants) fluctuates in size over the course of an epidemic is an important problem in epidemiology and ecology. The effective reproduction number, R, provides an intuitive and useful way of describing these fluctuations by characterising the growth rate of the infected population. An R > 1 signifies a burgeoning epidemic whereas R
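
The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Poisson renewal model, a conjugate gamma prior on Rt (hypothetical shape `a` and rate `b`), and that the "piecewise function" is a sliding window of `k` days over which Rt is held constant. Under those assumptions the one-step-ahead posterior predictive distribution is negative binomial, and the APE of a window is the sum of the negative log predictive probabilities of the observed counts. All function names, the serial-interval weights `w`, and the example data are illustrative assumptions.

```python
import math

def renewal_pressure(incidence, w, t):
    """Total infectiousness Lambda_t = sum_u I_{t-u} * w_u (w[0] is lag 1)."""
    return sum(incidence[t - u] * w[u - 1] for u in range(1, min(t, len(w)) + 1))

def nb_log_pmf(x, shape, p):
    """Log pmf of a negative binomial with real-valued size `shape` and success prob `p`."""
    return (math.lgamma(x + shape) - math.lgamma(x + 1) - math.lgamma(shape)
            + shape * math.log(p) + x * math.log1p(-p))

def ape(incidence, w, k, start, a=1.0, b=0.2):
    """Accumulated prediction error for window length k, scored on days start..end.

    Assumes R ~ Gamma(shape=a, rate=b) and I_t ~ Poisson(R * Lambda_t), so the
    posterior from the k days before t is Gamma(a + sum I, b + sum Lambda) and
    the predictive for I_t is negative binomial.
    """
    lam = [renewal_pressure(incidence, w, t) for t in range(len(incidence))]
    total = 0.0
    for t in range(start, len(incidence)):
        lo = max(1, t - k)                  # day 0 has no history, so skip it
        shape = a + sum(incidence[lo:t])
        rate = b + sum(lam[lo:t])
        if lam[t] <= 0:                     # simplification: skip days with no infection pressure
            continue
        p = rate / (rate + lam[t])
        total -= nb_log_pmf(incidence[t], shape, p)
    return total

def select_window(incidence, w, candidates):
    """Return the candidate window with the smallest APE, plus all scores."""
    start = max(candidates) + 1             # score every window on the same days
    scores = {k: ape(incidence, w, k, start) for k in candidates}
    return min(scores, key=scores.get), scores

# Illustrative (made-up) daily counts and serial-interval weights.
incidence = [1, 2, 3, 4, 6, 8, 11, 15, 20, 26, 30, 28, 25, 20, 16, 12, 9, 7, 5, 4]
w = [0.2, 0.5, 0.3]
best_k, scores = select_window(incidence, w, [2, 4, 6])
```

Scoring every candidate window on the same set of days keeps the APE values comparable; a shorter window reacts faster to changes in Rt, a longer one averages out noise, and the APE arbitrates between the two using predictive performance alone.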

Suggested Citation

  • Kris V Parag & Christl A Donnelly, 2020. "Using information theory to optimise epidemic models for real-time prediction and estimation," PLOS Computational Biology, Public Library of Science, vol. 16(7), pages 1-20, July.
  • Handle: RePEc:plo:pcbi00:1007990
    DOI: 10.1371/journal.pcbi.1007990

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007990
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1007990&type=printable
    Download Restriction: no



    Citations



    Cited by:

    1. Sophia Beckett Velez, 2021. "Idiosyncratic Viral Loss Theory: Systemic Operational Losses in Banks," JRFM, MDPI, vol. 14(2), pages 1-13, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Neuwald Andrew F., 2014. "Protein domain hierarchy Gibbs sampling strategies," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 13(4), pages 1-21, August.
    2. Junyi Lu & Sebastian Meyer, 2020. "Forecasting Flu Activity in the United States: Benchmarking an Endemic-Epidemic Beta Model," IJERPH, MDPI, vol. 17(4), pages 1-13, February.
    3. Das Ujjwal & Ebrahimi Nader, 2018. "A New Method For Covariate Selection In Cox Model," Statistics in Transition New Series, Polish Statistical Association, vol. 19(2), pages 297-314, June.
    4. Zelaya Mendizábal, Valentina & Boullé, Marc & Rossi, Fabrice, 2023. "Fast and fully-automated histograms for large-scale data sets," Computational Statistics & Data Analysis, Elsevier, vol. 180(C).
    5. Ujjwal Das & Nader Ebrahimi, 2018. "A New Method For Covariate Selection In Cox Model," Statistics in Transition New Series, Polish Statistical Association, vol. 19(2), pages 297-314, June.
    6. Emily S Nightingale & Lloyd A C Chapman & Sridhar Srikantiah & Swaminathan Subramanian & Purushothaman Jambulingam & Johannes Bracher & Mary M Cameron & Graham F Medley, 2020. "A spatio-temporal approach to short-term prediction of visceral leishmaniasis diagnoses in India," PLOS Neglected Tropical Diseases, Public Library of Science, vol. 14(7), pages 1-21, July.
    7. Coughlan de Perez, Erin & Stephens, Elisabeth & van Aalst, Maarten & Bazo, Juan & Fournier-Tombs, Eleonore & Funk, Sebastian & Hess, Jeremy J. & Ranger, Nicola & Lowe, Rachel, 2022. "Epidemiological versus meteorological forecasts: Best practice for linking models to policymaking," International Journal of Forecasting, Elsevier, vol. 38(2), pages 521-526.
    8. Yurij L. Katchanov & Natalia A. Shmatko, 2014. "Complexity-Based Modeling of Scientific Capital: An Outline of Mathematical Theory," International Journal of Mathematics and Mathematical Sciences, Hindawi, vol. 2014, pages 1-10, October.
    9. Mullins, Joshua & Mahadevan, Sankaran, 2014. "Variable-fidelity model selection for stochastic simulation," Reliability Engineering and System Safety, Elsevier, vol. 131(C), pages 40-52.
    10. K. Vela Velupillai, 2010. "The Algorithmic Revolution in the Social Sciences: Mathematical Economics, Game Theory and Statistical Inference," ASSRU Discussion Papers 1005, ASSRU - Algorithmic Social Science Research Unit.
    11. Andrew F Neuwald & Stephen F Altschul, 2016. "Bayesian Top-Down Protein Sequence Alignment with Inferred Position-Specific Gap Penalties," PLOS Computational Biology, Public Library of Science, vol. 12(5), pages 1-21, May.
    12. Alperen Bektas & Valentino Piana & René Schumann, 2021. "A meso-level empirical validation approach for agent-based computational economic models drawing on micro-data: a use case with a mobility mode-choice model," SN Business & Economics, Springer, vol. 1(6), pages 1-25, June.
    13. Neuwald Andrew F., 2011. "Surveying the Manifold Divergence of an Entire Protein Class for Statistical Clues to Underlying Biochemical Mechanisms," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-30, August.
    14. Löcherbach, Eva & Orlandi, Enza, 2011. "Neighborhood radius estimation for variable-neighborhood random fields," Stochastic Processes and their Applications, Elsevier, vol. 121(9), pages 2151-2185, September.
    15. Vittoria Bruni & Michela Tartaglione & Domenico Vitulano, 2020. "A Signal Complexity-Based Approach for AM–FM Signal Modes Counting," Mathematics, MDPI, vol. 8(12), pages 1-33, December.
