IDEAS home. Printed from https://ideas.repec.org/a/sae/medema/v35y2015i2p162-169.html

Calibration of Risk Prediction Models

Author

Listed:
  • Ben Van Calster
  • Andrew J. Vickers

Abstract

Decision-analytic measures to assess clinical utility of prediction models and diagnostic tests incorporate the relative clinical consequences of true and false positives without the need for external information such as monetary costs. Net Benefit is a commonly used metric that weights the relative consequences in terms of the risk threshold at which a patient would opt for treatment. Theoretical results demonstrate that clinical utility is affected by a model’s calibration, the extent to which estimated risks correspond to observed event rates. We analyzed the effects of different types of miscalibration on Net Benefit and investigated whether and under what circumstances miscalibration can make a model clinically harmful. Clinical harm is defined as a lower Net Benefit compared with classifying all patients as positive or negative by default. We used simulated data to investigate the effect of overestimation, underestimation, overfitting (estimated risks too extreme), and underfitting (estimated risks too close to baseline risk) on Net Benefit for different choices of the risk threshold. In accordance with theory, we observed that miscalibration always reduced Net Benefit. Harm was sometimes observed when models underestimated risk at a threshold below the event rate (as in underestimation and overfitting) or overestimated risk at a threshold above the event rate (as in overestimation and overfitting). Underfitting never resulted in a harmful model. The impact of miscalibration decreased with increasing discrimination. Net Benefit was less sensitive to miscalibration for risk thresholds close to the event rate than for other thresholds. We illustrate these findings with examples from the literature and with a case study on testicular cancer diagnosis. Our findings underscore the importance of obtaining calibrated risk models.
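The Net Benefit calculation and the miscalibration scenarios described in the abstract can be made concrete with a short simulation. The sketch below is illustrative only, not the authors' code: the population parameters, the `net_benefit` and `miscalibrate` helper names, the threshold of 0.2, and the logit-scale shifts used to mimic overestimation and underfitting are all assumptions chosen for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def net_benefit(y, p, pt):
    """Net Benefit of treating patients whose estimated risk p meets threshold pt:
    NB = (TP - FP * pt/(1-pt)) / n."""
    n = len(y)
    treat = p >= pt
    tp = np.sum(treat & (y == 1))
    fp = np.sum(treat & (y == 0))
    return (tp - fp * pt / (1 - pt)) / n

# Simulate a population with a single predictor and known true risks
# (hypothetical parameters giving an event rate of roughly 0.3).
n = 200_000
x = rng.normal(size=n)
true_logit = -1.0 + 1.2 * x
y = rng.binomial(1, 1 / (1 + np.exp(-true_logit)))

# Miscalibration expressed as shifts on the logit scale:
#   intercept shift -> systematic over- or underestimation
#   slope != 1      -> overfitting (>1, risks too extreme) or underfitting (<1)
def miscalibrate(logit, intercept=0.0, slope=1.0):
    return 1 / (1 + np.exp(-(intercept + slope * logit)))

pt = 0.2  # illustrative risk threshold below the event rate
calibrated   = miscalibrate(true_logit)
overestimate = miscalibrate(true_logit, intercept=1.0)
underfit     = miscalibrate(true_logit, slope=0.4)

for label, p in [("calibrated", calibrated),
                 ("overestimating", overestimate),
                 ("underfitting", underfit)]:
    print(f"{label:>14}: NB = {net_benefit(y, p, pt):.4f}")

# Default strategies for comparison:
event_rate = y.mean()
nb_all = event_rate - (1 - event_rate) * pt / (1 - pt)  # treat everyone
print(f"     treat-all: NB = {nb_all:.4f}")
print(f"    treat-none: NB = 0.0000")
```

On the logit scale, an intercept shift moves every estimated risk up or down (systematic over- or underestimation), while a slope below 1 shrinks risks toward the baseline (underfitting). With a large simulated population, the calibrated model attains the highest Net Benefit, in line with the theoretical result that miscalibration always reduces Net Benefit.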

Suggested Citation

  • Ben Van Calster & Andrew J. Vickers, 2015. "Calibration of Risk Prediction Models," Medical Decision Making, vol. 35(2), pages 162-169, February.
  • Handle: RePEc:sae:medema:v:35:y:2015:i:2:p:162-169
    DOI: 10.1177/0272989X14547233

    Download full text from publisher

    File URL: https://journals.sagepub.com/doi/10.1177/0272989X14547233
    Download Restriction: no

    File URL: https://libkey.io/10.1177/0272989X14547233?utm_source=ideas
    LibKey link: if access is restricted and your library uses this service, LibKey will redirect you to a version you can access through your library subscription.

    References listed on IDEAS

    1. Walter Bouwmeester & Nicolaas P A Zuithoff & Susan Mallett & Mirjam I Geerlings & Yvonne Vergouwe & Ewout W Steyerberg & Douglas G Altman & Karel G M Moons, 2012. "Reporting and Methods in Clinical Prediction Research: A Systematic Review," PLOS Medicine, Public Library of Science, vol. 9(5), pages 1-13, May.
    2. Stuart G. Baker & Nancy R. Cook & Andrew Vickers & Barnett S. Kramer, 2009. "Using relative utility curves to evaluate risk prediction," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 172(4), pages 729-748, October.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project; subscribe to its RSS feed for this item.


    Cited by:

    1. Wenjuan Wang & Martin Kiik & Niels Peek & Vasa Curcin & Iain J Marshall & Anthony G Rudd & Yanzhong Wang & Abdel Douiri & Charles D Wolfe & Benjamin Bray, 2020. "A systematic review of machine learning models for predicting outcomes of stroke with structured data," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-16, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ben Van Calster & Ewout W. Steyerberg & Ralph B. D’Agostino Sr & Michael J. Pencina, 2014. "Sensitivity and Specificity Can Change in Opposite Directions When New Predictive Markers Are Added to Risk Models," Medical Decision Making, vol. 34(4), pages 513-522, May.
    2. Thomas P A Debray & Karel G M Moons & Ghada Mohammed Abdallah Abo-Zaid & Hendrik Koffijberg & Richard David Riley, 2013. "Individual Participant Data Meta-Analysis for a Binary Outcome: One-Stage or Two-Stage?," PLOS ONE, Public Library of Science, vol. 8(4), pages 1-10, April.
    3. Ben Van Calster & Andrew J. Vickers & Michael J. Pencina & Stuart G. Baker & Dirk Timmerman & Ewout W. Steyerberg, 2013. "Evaluation of Markers and Risk Prediction Models," Medical Decision Making, vol. 33(4), pages 490-501, May.
    4. Helder Novais Bastos & Nuno S Osório & António Gil Castro & Angélica Ramos & Teresa Carvalho & Leonor Meira & David Araújo & Leonor Almeida & Rita Boaventura & Patrícia Fragata & Catarina Chaves & Pat, 2016. "A Prediction Rule to Stratify Mortality Risk of Patients with Pulmonary Tuberculosis," PLOS ONE, Public Library of Science, vol. 11(9), pages 1-14, September.
    5. Paul Bach & Christine Wallisch & Nadja Klein & Lorena Hafermann & Willi Sauerbrei & Ewout W Steyerberg & Georg Heinze & Geraldine Rauch & for topic group 2 of the STRATOS initiative, 2020. "Systematic review of education and practical guidance on regression modeling for medical researchers who lack a strong statistical background: Study protocol," PLOS ONE, Public Library of Science, vol. 15(12), pages 1-10, December.
    6. Tracey L. Marsh & Holly Janes & Margaret S. Pepe, 2020. "Statistical inference for net benefit measures in biomarker validation studies," Biometrics, The International Biometric Society, vol. 76(3), pages 843-852, September.
    7. Shi, Chengchun & Lu, Wenbin & Song, Rui, 2019. "A sparse random projection-based test for overall qualitative treatment effects," LSE Research Online Documents on Economics 102107, London School of Economics and Political Science, LSE Library.
    8. Michael Lebenbaum & Osvaldo Espin-Garcia & Yi Li & Laura C Rosella, 2018. "Development and validation of a population based risk algorithm for obesity: The Obesity Population Risk Tool (OPoRT)," PLOS ONE, Public Library of Science, vol. 13(1), pages 1-11, January.
    9. Hormuzd A. Katki & Ionut Bebu, 2021. "A simple framework to identify optimal cost‐effective risk thresholds for a single screen: Comparison to Decision Curve Analysis," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(3), pages 887-903, July.
    10. Mei-Cheng Wang & Shanshan Li, 2012. "Bivariate Marker Measurements and ROC Analysis," Biometrics, The International Biometric Society, vol. 68(4), pages 1207-1218, December.
    11. Holly Janes & Margaret S. Pepe & Ying Huang, 2014. "A Framework for Evaluating Markers Used to Select Patient Treatment," Medical Decision Making, vol. 34(2), pages 159-167, February.
    12. Baker Stuart G. & Van Calster Ben & Steyerberg Ewout W., 2012. "Evaluating a New Marker for Risk Prediction Using the Test Tradeoff: An Update," The International Journal of Biostatistics, De Gruyter, vol. 8(1), pages 1-37, March.
    13. Igor O Korolev & Laura L Symonds & Andrea C Bozoki & Alzheimer's Disease Neuroimaging Initiative, 2016. "Predicting Progression from Mild Cognitive Impairment to Alzheimer's Dementia Using Clinical, MRI, and Plasma Biomarkers via Probabilistic Pattern Classification," PLOS ONE, Public Library of Science, vol. 11(2), pages 1-25, February.
    14. Taro Takeshima & Yosuke Yamamoto & Yoshinori Noguchi & Nobuyuki Maki & Koichiro Gibo & Yukio Tsugihashi & Asako Doi & Shingo Fukuma & Shin Yamazaki & Eiji Kajii & Shunichi Fukuhara, 2016. "Identifying Patients with Bacteremia in Community-Hospital Emergency Rooms: A Retrospective Cohort Study," PLOS ONE, Public Library of Science, vol. 11(3), pages 1-17, March.
    15. Marta Morales-Puerto & María Ruiz-Díaz & Marta Aranda-Gallardo & José Miguel Morales-Asencio & Purificación Alcalá-Gutiérrez & José Antonio Rodríguez-Montalvo & Álvaro León-Campos & Silvia García-Mayo, 2022. "Development of a Clinical Prediction Rule for Adverse Events in Multimorbid Patients in Emergency and Hospitalisation," IJERPH, MDPI, vol. 19(14), pages 1-14, July.
    16. Ying Huang & Eric Laber, 2016. "Personalized Evaluation of Biomarker Value: A Cost-Benefit Perspective," Statistics in Biosciences, Springer; International Chinese Statistical Association, vol. 8(1), pages 43-65, June.
    17. John H Wasson & Lynn Ho & Laura Soloway & L Gordon Moore, 2018. "Validation of the What Matters Index: A brief, patient-reported index that guides care for chronic conditions and can substitute for computer-generated risk models," PLOS ONE, Public Library of Science, vol. 13(2), pages 1-13, February.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:sae:medema:v:35:y:2015:i:2:p:162-169. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: SAGE Publications (email available below).

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.