AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning

Author

Fergus Imrie
Bogdan Cebere
Eoin F McKinney
Mihaela van der Schaar

Abstract

Diagnostic and prognostic models are increasingly important in medicine and inform many clinical decisions. Recently, machine learning approaches have shown improvement over conventional modeling techniques by better capturing complex interactions between patient covariates in a data-driven manner. However, the use of machine learning introduces technical and practical challenges that have thus far restricted widespread adoption of such techniques in clinical settings. To address these challenges and empower healthcare professionals, we present an open-source machine learning framework, AutoPrognosis 2.0, to facilitate the development of diagnostic and prognostic models. AutoPrognosis leverages state-of-the-art advances in automated machine learning to develop optimized machine learning pipelines, incorporates model explainability tools, and enables deployment of clinical demonstrators, without requiring significant technical expertise. To demonstrate AutoPrognosis 2.0, we provide an illustrative application where we construct a prognostic risk score for diabetes using the UK Biobank, a prospective study of 502,467 individuals. The models produced by our automated framework achieve greater discrimination for diabetes than expert clinical risk scores. We have implemented our risk score as a web-based decision support tool, which can be publicly accessed by patients and clinicians. By open-sourcing our framework as a tool for the community, we aim to provide clinicians and other medical practitioners with an accessible resource to develop new risk scores, personalized diagnostics, and prognostics using machine learning techniques.

Software: https://github.com/vanderschaarlab/AutoPrognosis

Author summary: Previous studies have reported promising applications of machine learning (ML) approaches in healthcare. However, there remain significant challenges to using ML for diagnostic and prognostic modeling, particularly for non-ML experts, that currently prevent broader adoption of these approaches. We developed an open-source tool, AutoPrognosis 2.0, to address these challenges and make modern statistical and machine learning methods available to expert and non-expert ML users. AutoPrognosis configures and optimizes ML pipelines using automated machine learning to develop powerful predictive models, while also providing interpretability methods to allow users to understand and debug these models. This study illustrates the application of AutoPrognosis to diabetes risk prediction using data from UK Biobank. The risk score developed using AutoPrognosis outperforms existing risk scores and has been implemented as a web-based decision support tool that can be publicly accessed by patients and clinicians. This study suggests that AutoPrognosis 2.0 can be used by healthcare experts to create new clinical tools and predictive pipelines across various clinical outcomes, employing advanced machine learning techniques.
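
The abstract compares risk scores by their discrimination. As a rough illustration only, the sketch below computes one common discrimination measure, the area under the ROC curve (AUROC), for two risk scores on the same patients. It does not use the AutoPrognosis API, and the abstract does not state which metric the authors report; the function name `auroc` and the toy labels and scores are hypothetical and are not taken from the paper or from UK Biobank.

```python
from itertools import product


def auroc(labels, scores):
    """Probability that a randomly chosen positive case receives a higher
    score than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Compare every positive/negative pair and count how often the
    # positive case is ranked above the negative one.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = developed the outcome, 0 = did not.
labels = [0, 0, 1, 0, 1, 1]
score_a = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]  # illustrative model output
score_b = [0.30, 0.60, 0.35, 0.20, 0.50, 0.70]  # illustrative baseline score

print(f"AUROC, score A: {auroc(labels, score_a):.3f}")
print(f"AUROC, score B: {auroc(labels, score_b):.3f}")
```

A higher AUROC means the score ranks patients who develop the outcome above those who do not more often; the pairwise form used here is quadratic in the number of patients, so it suits small examples rather than cohort-scale data.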

Suggested Citation

Fergus Imrie & Bogdan Cebere & Eoin F McKinney & Mihaela van der Schaar, 2023. "AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning," PLOS Digital Health, Public Library of Science, vol. 2(6), pages 1-21, June.

Handle: RePEc:plo:pdig00:0000276
DOI: 10.1371/journal.pdig.0000276

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Rawan Omar & Sooyun Caroline Tavolacci & Lathan Liou & Dillan F Villavisanis & Yoav Y Broza & Hossam Haick, 2024. "Real-time prognostic biomarkers for predicting in-hospital mortality and cardiac complications in COVID-19 patients," PLOS Global Public Health, Public Library of Science, vol. 4(3), pages 1-17, March.
Dexin Chen & Meiting Fu & Liangjie Chi & Liyan Lin & Jiaxin Cheng & Weisong Xue & Chenyan Long & Wei Jiang & Xiaoyu Dong & Jian Sui & Dajia Lin & Jianping Lu & Shuangmu Zhuo & Side Liu & Guoxin Li & G, 2022. "Prognostic and predictive value of a pathomics signature in gastric cancer," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
Alex Thompson & Scott Devine & Mike Kattan & Andrew Muir, 2014. "Prediction of Treatment Week Eight Response & Sustained Virologic Response in Patients Treated with Boceprevir Plus Peginterferon Alfa and Ribavirin," PLOS ONE, Public Library of Science, vol. 9(8), pages 1-8, August.
Tracey L. Marsh & Holly Janes & Margaret S. Pepe, 2020. "Statistical inference for net benefit measures in biomarker validation studies," Biometrics, The International Biometric Society, vol. 76(3), pages 843-852, September.
Tae Yoon Lee & Paul Gustafson & Mohsen Sadatsafavi, 2023. "Closed-Form Solution of the Unit Normal Loss Integral in 2 Dimensions, with Application in Value-of-Information Analysis," Medical Decision Making, vol. 43(5), pages 621-626, July.
Baker Stuart G. & Van Calster Ben & Steyerberg Ewout W., 2012. "Evaluating a New Marker for Risk Prediction Using the Test Tradeoff: An Update," The International Journal of Biostatistics, De Gruyter, vol. 8(1), pages 1-37, March.
Kevin Sandeman & Juho T Eineluoto & Joona Pohjonen & Andrew Erickson & Tuomas P Kilpeläinen & Petrus Järvinen & Henrikki Santti & Anssi Petas & Mika Matikainen & Suvi Marjasuo & Anu Kenttämies & Tuoma, 2020. "Prostate MRI added to CAPRA, MSKCC and Partin cancer nomograms significantly enhances the prediction of adverse findings and biochemical recurrence after radical prostatectomy," PLOS ONE, Public Library of Science, vol. 15(7), pages 1-14, July.
Todd J. Levy & Kevin Coppa & Jinxuan Cang & Douglas P. Barnaby & Marc D. Paradis & Stuart L. Cohen & Alex Makhnevich & David Klaveren & David M. Kent & Karina W. Davidson & Jamie S. Hirsch & Theodoros, 2022. "Development and validation of self-monitoring auto-updating prognostic models of survival for hospitalized COVID-19 patients," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
Jérôme Allyn & Cyril Ferdynus & Michel Bohrer & Cécile Dalban & Dorothée Valance & Nicolas Allou, 2016. "Simplified Acute Physiology Score II as Predictor of Mortality in Intensive Care Units: A Decision Curve Analysis," PLOS ONE, Public Library of Science, vol. 11(10), pages 1-11, October.
Noémi Kreif & Richard Grieve & Iván Díaz & David Harrison, 2015. "Evaluation of the Effect of a Continuous Treatment: A Machine Learning Approach with an Application to Treatment for Traumatic Brain Injury," Health Economics, John Wiley & Sons, Ltd., vol. 24(9), pages 1213-1228, September.
Abhilash Bandam & Eedris Busari & Chloi Syranidou & Jochen Linssen & Detlef Stolten, 2022. "Classification of Building Types in Germany: A Data-Driven Modeling Approach," Data, MDPI, vol. 7(4), pages 1-23, April.
Boonstra Philip S. & Little Roderick J.A. & West Brady T. & Andridge Rebecca R. & Alvarado-Leiton Fernanda, 2021. "A Simulation Study of Diagnostics for Selection Bias," Journal of Official Statistics, Sciendo, vol. 37(3), pages 751-769, September.
Lin Lin & Rachel L Spreng & Kelly E Seaton & S Moses Dennison & Lindsay C Dahora & Daniel J Schuster & Sheetal Sawant & Peter B Gilbert & Youyi Fong & Neville Kisalu & Andrew J Pollard & Georgia D Tom, 2024. "GeM-LR: Discovering predictive biomarkers for small datasets in vaccine studies," PLOS Computational Biology, Public Library of Science, vol. 20(11), pages 1-23, November.
Ja Hyeon Ku & Myong Kim & Seok-Soo Byun & Hyeon Jeong & Cheol Kwak & Hyeon Hoe Kim & Sang Eun Lee, 2015. "External Validation of Models for Prediction of Lymph Node Metastasis in Urothelial Carcinoma of the Bladder," PLOS ONE, Public Library of Science, vol. 10(10), pages 1-10, October.
Christopher J Greenwood & George J Youssef & Primrose Letcher & Jacqui A Macdonald & Lauryn J Hagg & Ann Sanson & Jenn Mcintosh & Delyse M Hutchinson & John W Toumbourou & Matthew Fuller-Tyszkiewicz &, 2020. "A comparison of penalised regression methods for informing the selection of predictive markers," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-14, November.
Liangyuan Hu & Lihua Li, 2022. "Using Tree-Based Machine Learning for Health Studies: Literature Review and Case Series," IJERPH, MDPI, vol. 19(23), pages 1-13, December.
Norah Alyabs & Sy Han Chiou, 2022. "The Missing Indicator Approach for Accelerated Failure Time Model with Covariates Subject to Limits of Detection," Stats, MDPI, vol. 5(2), pages 1-13, May.
Marek Šedivý, 2023. "Mortality shocks and household consumption: the case of Mexico," Review of Economics of the Household, Springer, vol. 21(4), pages 1289-1358, December.
- Marek Sedivy, 2020. "Mortality Shocks and Household Consumption: The Case of Mexico," Working Papers IES 2020/22, Charles University Prague, Faculty of Social Sciences, Institute of Economic Studies, revised Aug 2020.
Burnett, J. Wesley & Lacombe, Donald J. & Wallander, Steven, . "Spatial and Temporal Spillovers in US Cropland Values," Journal of Agricultural and Resource Economics, Western Agricultural Economics Association, vol. 49(01).
- Burnett, J. Wesley & Lacombe, David A. & Wallander, Steven, . "Corrigendum to “Spatial and Temporal Spillovers in US Cropland Values”," Journal of Agricultural and Resource Economics, Western Agricultural Economics Association, vol. 49(2).
Félix L. Morales & Feihong Xu & Hyojun Ada Lee & Helio Tejedor Navarro & Meagan A. Bechel & Eryn L. Cameron & Jesse Kelso & Curtis H. Weiss & Luís A. Nunes Amaral, 2025. "Open-source computational pipeline flags instances of acute respiratory distress syndrome in mechanically ventilated adult patients," Nature Communications, Nature, vol. 16(1), pages 1-17, December.
