Application of machine learning approaches to administrative claims data to predict clinical outcomes in medical and surgical patient populations

Author

Listed:

Emily J MacKay
Michael D Stubna
Corey Chivers
Michael E Draugelis
William J Hanson
Nimesh D Desai
Peter W Groeneveld

Abstract

Objective: This study aimed to develop and validate a claims-based machine learning algorithm to predict clinical outcomes across both medical and surgical patient populations.

Methods: This retrospective, observational cohort study used a random 5% sample of 770,777 fee-for-service Medicare beneficiaries with an inpatient hospitalization between 2009 and 2011. The machine learning algorithms tested were support vector machine, random forest, multilayer perceptron, extreme gradient boosted tree, and logistic regression. The extreme gradient boosted tree algorithm outperformed the alternatives and was the method used for the final risk model. The primary outcome was 30-day mortality. The secondary outcomes were rehospitalization and any of 23 adverse clinical events occurring within 30 days of the index admission date.

Results: Algorithm performance was evaluated by both the area under the receiver operating characteristic curve (AUROC) and the Brier Score. The risk model demonstrated high performance for prediction of 30-day mortality (AUROC = 0.88; Brier Score = 0.06) and of 17 of the 23 adverse events (AUROC range: 0.80–0.86; Brier Score range: 0.01–0.05). The risk model demonstrated moderate performance for prediction of rehospitalization within 30 days (AUROC = 0.73; Brier Score = 0.07) and of six of the 23 adverse events (AUROC range: 0.74–0.79; Brier Score range: 0.01–0.02). The risk model performed comparably on a second, independent validation dataset, confirming that it was not overfit.

Conclusions and relevance: We have developed and validated a robust, claims-based machine learning risk model that is applicable to both medical and surgical patient populations and demonstrates predictive accuracy comparable to that of existing risk models.
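The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `auroc` and `brier_score` and the four-patient toy data are assumptions made purely for illustration. AUROC is computed here with the standard rank-based (Mann-Whitney) formulation, and the Brier Score as the mean squared difference between the predicted probability and the observed 0/1 outcome.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome.

    Lower is better; 0.0 means perfect probabilistic predictions.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive case is scored above a random
    negative case, with ties counted as one half (Mann-Whitney formulation).
    """
    n = len(y_true)
    if n != len(y_prob):
        raise ValueError("inputs must be of equal length")
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")

    # Assign 1-based ranks by ascending score, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: y_prob[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and y_prob[order[j + 1]] == y_prob[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Mann-Whitney U statistic for the positive class, scaled to [0, 1].
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Hypothetical toy data (NOT from the study): 1 = died within 30 days.
outcomes = [0, 0, 1, 1]
predicted_risk = [0.1, 0.4, 0.35, 0.8]

print(f"AUROC = {auroc(outcomes, predicted_risk):.2f}")
print(f"Brier Score = {brier_score(outcomes, predicted_risk):.4f}")
```

The two metrics answer different questions: AUROC measures how well the model ranks patients by risk (discrimination), while the Brier Score also penalizes predicted probabilities that are far from the observed outcome. Brier Scores are also naturally small when the outcome is rare, so the low values reported in the abstract are best read alongside the corresponding AUROC values.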

Suggested Citation

Emily J MacKay & Michael D Stubna & Corey Chivers & Michael E Draugelis & William J Hanson & Nimesh D Desai & Peter W Groeneveld, 2021. "Application of machine learning approaches to administrative claims data to predict clinical outcomes in medical and surgical patient populations," PLOS ONE, Public Library of Science, vol. 16(6), pages 1-14, June.

Handle: RePEc:plo:pone00:0252585
DOI: 10.1371/journal.pone.0252585

Citations

Cited by:

Renu Sabharwal & Shah J. Miah & Samuel Fosso Wamba, 2025. "Extending artificial intelligence research in the clinical domain: a theoretical perspective," Annals of Operations Research, Springer, vol. 348(3), pages 1713-1744, May.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Stephen F Weng & Luis Vaz & Nadeem Qureshi & Joe Kai, 2019. "Prediction of premature all-cause mortality: A prospective general population cohort study comparing machine-learning and standard epidemiological approaches," PLOS ONE, Public Library of Science, vol. 14(3), pages 1-22, March.
Majd Oteibi & Adam Tamimi & Kaneez Abbas & Gabriel Tamimi & Danesh Khazaei & Hadi Khazaei, 2024. "Advancing Digital Health using AI and Machine Learning Solutions for Early Ultrasonic Detection of Breast Disorders in Women," International Journal of Research and Scientific Innovation, International Journal of Research and Scientific Innovation (IJRSI), vol. 11(11), pages 518-527, November.
Riccardo Zanardelli, 2025. "The human-machine paradox: how collaboration creates or destroys value, and why augmentation is key to resolving it," Papers 2509.14057, arXiv.org, revised Nov 2025.
Lin Lu & Laurent Dercle & Binsheng Zhao & Lawrence H. Schwartz, 2021. "Deep learning for the prediction of early on-treatment response in metastatic colorectal cancer from serial medical imaging," Nature Communications, Nature, vol. 12(1), pages 1-11, December.
Salvatore Tedesco & Martina Andrulli & Markus Åkerlund Larsson & Daniel Kelly & Antti Alamäki & Suzanne Timmons & John Barton & Joan Condell & Brendan O’Flynn & Anna Nordström, 2021. "Comparison of Machine Learning Techniques for Mortality Prediction in a Prospective Cohort of Older Adults," IJERPH, MDPI, vol. 18(23), pages 1-18, December.
Zheng Yan & Wenqian Robertson & Yaosheng Lou & Tom W. Robertson & Sung Yong Park, 2021. "Finding leading scholars in mobile phone behavior: a mixed-method analysis of an emerging interdisciplinary field," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(12), pages 9499-9517, December.
Freddy Gabbay & Rotem Lev Aharoni & Ori Schweitzer, 2022. "Deep Neural Network Memory Performance and Throughput Modeling and Simulation Framework," Mathematics, MDPI, vol. 10(21), pages 1-20, November.
Ting Wang & Boyang Zang & Chui Kong & Yigang Li & Xiaomin Yang & Yi Yu, 2025. "Intelligent and precise auxiliary diagnosis of breast tumors using deep learning and radiomics," PLOS ONE, Public Library of Science, vol. 20(6), pages 1-11, June.
Sonika Darshan, 2024. "Data Mining for Disease Diagnosis: A Review of Machine Learning Approaches in Healthcare," Journal of Artificial Intelligence General science (JAIGS) ISSN:3006-4023, Open Knowledge, vol. 6(1), pages 716-726.
Gang Yu & Kai Sun & Chao Xu & Xing-Hua Shi & Chong Wu & Ting Xie & Run-Qi Meng & Xiang-He Meng & Kuan-Song Wang & Hong-Mei Xiao & Hong-Wen Deng, 2021. "Accurate recognition of colorectal cancer with semi-supervised deep learning on pathological images," Nature Communications, Nature, vol. 12(1), pages 1-13, December.
Yue Sun & Songmin Dai & Jide Li & Yin Zhang & Xiaoqiang Li, 2019. "Tooth-Marked Tongue Recognition Using Gradient-Weighted Class Activation Maps," Future Internet, MDPI, vol. 11(2), pages 1-12, February.
DonHee Lee & Seong No Yoon, 2021. "Application of Artificial Intelligence-Based Technologies in the Healthcare Industry: Opportunities and Challenges," IJERPH, MDPI, vol. 18(1), pages 1-18, January.
Wenjuan Fan & Jingnan Liu & Shuwan Zhu & Panos M. Pardalos, 2020. "Investigating the impacting factors for the healthcare professionals to adopt artificial intelligence-based medical diagnosis support system (AIMDSS)," Annals of Operations Research, Springer, vol. 294(1), pages 567-592, November.
Shang Li & Fei Yu & Shankou Zhang & Huige Yin & Hairong Lin, 2025. "Optimization of Direct Convolution Algorithms on ARM Processors for Deep Learning Inference," Mathematics, MDPI, vol. 13(5), pages 1-19, February.
Young Jae Kim & Seung Seog Han & Hee Joo Yang & Sung Eun Chang, 2020. "Prospective, comparative evaluation of a deep neural network and dermoscopy in the diagnosis of onychomycosis," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-9, June.
Dario Sipari & Betsy D. M. Chaparro-Rico & Daniele Cafolla, 2022. "SANE (Easy Gait Analysis System): Towards an AI-Assisted Automatic Gait-Analysis," IJERPH, MDPI, vol. 19(16), pages 1-27, August.
N Salet & A Gökdemir & J Preijde & C H van Heck & F Eijkenaar, 2024. "Using machine learning to predict acute myocardial infarction and ischemic heart disease in primary care cardiovascular patients," PLOS ONE, Public Library of Science, vol. 19(7), pages 1-17, July.
Darko B. Vuković & Senanu Dekpo-Adza & Stefana Matović, 2025. "AI integration in financial services: a systematic review of trends and regulatory challenges," Humanities and Social Sciences Communications, Palgrave Macmillan, vol. 12(1), pages 1-29, December.
Ying Wang & Zhicheng Du & Wayne R. Lawrence & Yun Huang & Yu Deng & Yuantao Hao, 2019. "Predicting Hepatitis B Virus Infection Based on Health Examination Data of Community Population," IJERPH, MDPI, vol. 16(23), pages 1-13, December.
Mara Giavina-Bianchi & Raquel Machado de Sousa & Vitor Zago de Almeida Paciello & William Gois Vitor & Aline Lissa Okita & Renata Prôa & Gian Lucca dos Santos Severino & Anderson Alves Schinaid & Rafa, 2021. "Implementation of artificial intelligence algorithms for melanoma screening in a primary care setting," PLOS ONE, Public Library of Science, vol. 16(9), pages 1-13, September.
