
Assessing the accuracy of predictive models for numerical data: Not r nor r2, why not? Then what?

Author

Listed:
  • Jin Li

Abstract

Assessing the accuracy of predictive models is critical because such models are increasingly used across disciplines and predictive accuracy determines the quality of the resulting predictions. The Pearson product-moment correlation coefficient (r) and the coefficient of determination (r2) are among the most widely used measures for assessing predictive models for numerical data, although they have been argued to be biased, insufficient and misleading. In this study, geometrical graphs were used to illustrate what is used in the calculation of r and r2, and simulations were used to demonstrate the behaviour of r and r2 and to compare three accuracy measures under various scenarios. Relevant confusions about r and r2 have been clarified. The calculation of r and r2 is not based on the differences between the predicted and observed values. The existing error measures suffer from various limitations and cannot tell how accurate a model is. Variance explained by predictive models based on cross-validation (VEcv) is free of these limitations and is a reliable accuracy measure. Legates and McCabe’s efficiency (E1) is also an alternative accuracy measure. The r and r2 do not measure accuracy and are incorrect accuracy measures. The existing error measures suffer from limitations. VEcv and E1 are recommended for assessing accuracy. Applying these accuracy measures would encourage the development of more accurate predictive models to generate predictions for evidence-informed decision-making.
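
The abstract's central point, that r and r2 are not computed from the differences between predicted and observed values, can be illustrated with a small sketch. The abstract itself gives no formulas, so the definitions below are assumptions based on how these measures are commonly written: VEcv as (1 - SSE/SST) x 100 and E1 as (1 - sum of absolute errors / sum of absolute deviations from the observed mean) x 100. The function names and the toy data are hypothetical, and in real use the predictions passed to `vecv` would come from cross-validation rather than being typed in by hand.

```python
from math import sqrt


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    sd_o = sqrt(sum((o - mean_o) ** 2 for o in observed))
    sd_p = sqrt(sum((p - mean_p) ** 2 for p in predicted))
    return cov / (sd_o * sd_p)


def vecv(observed, predicted):
    """Variance explained (%), assumed form: (1 - SSE / SST) * 100.

    For VEcv proper, `predicted` should hold cross-validated predictions.
    """
    mean_o = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_o) ** 2 for o in observed)
    return (1 - sse / sst) * 100


def e1(observed, predicted):
    """Legates and McCabe's efficiency (%), assumed absolute-error form."""
    mean_o = sum(observed) / len(observed)
    abs_err = sum(abs(o - p) for o, p in zip(observed, predicted))
    abs_dev = sum(abs(o - mean_o) for o in observed)
    return (1 - abs_err / abs_dev) * 100


# Hypothetical toy data: every prediction is off by exactly 10 units.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [11.0, 12.0, 13.0, 14.0, 15.0]

r = pearson_r(observed, predicted)
print(f"r    = {r:.4f}")
print(f"r2   = {r * r:.4f}")
print(f"VEcv = {vecv(observed, predicted):.2f}%")
print(f"E1   = {e1(observed, predicted):.2f}%")
```

With these toy values, r and r2 are 1 (up to floating-point rounding) even though every prediction is wrong by 10 units, whereas VEcv and E1 as defined here are strongly negative because they are built from the prediction errors themselves.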

Suggested Citation

  • Jin Li, 2017. "Assessing the accuracy of predictive models for numerical data: Not r nor r2, why not? Then what?," PLOS ONE, Public Library of Science, vol. 12(8), pages 1-16, August.
  • Handle: RePEc:plo:pone00:0183250
    DOI: 10.1371/journal.pone.0183250

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0183250
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0183250&type=printable
    Download Restriction: no


    Citations



    Cited by:

    1. Shinyoung Kwag & Daegi Hahm & Minkyu Kim & Seunghyun Eem, 2020. "Development of a Probabilistic Seismic Performance Assessment Model of Slope Using Machine Learning Methods," Sustainability, MDPI, vol. 12(8), pages 1-22, April.
    2. Fort, Hugo, 2018. "On predicting species yields in multispecies communities: Quantifying the accuracy of the linear Lotka-Volterra generalized model," Ecological Modelling, Elsevier, vol. 387(C), pages 154-162.
    3. Fort, Hugo, 2020. "Making quantitative predictions on the yield of a species immersed in a multispecies community: The focal species method," Ecological Modelling, Elsevier, vol. 430(C).
    4. Ritabrata Roy & Mrinmoy Majumder, 2022. "Assessment of water quality trends in Deepor Beel, Assam, India," Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development, Springer, vol. 24(12), pages 14327-14347, December.
    5. Siwei Li & Jingjing An & Yaqiu Li & Xiagu Zhu & Dongdong Zhao & Lixian Wang & Yonghui Sun & Yuanzhao Yang & Changhao Bi & Xueli Zhang & Meng Wang, 2022. "Automated high-throughput genome editing platform with an AI learning in situ prediction model," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    6. Rufino, Marta M. & Albouy, Camille & Brind'Amour, Anik, 2021. "Which spatial interpolators I should use? A case study applying to marine species," Ecological Modelling, Elsevier, vol. 449(C).
    7. Daniel S. Maynard & Lalasia Bialic-Murphy & Constantin M. Zohner & Colin Averill & Johan Hoogen & Haozhi Ma & Lidong Mo & Gabriel Reuben Smith & Alicia T. R. Acosta & Isabelle Aubin & Erika Berenguer, 2022. "Global relationships in tree functional traits," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
