Printed from https://ideas.repec.org/a/plo/pone00/0214365.html

Prediction of premature all-cause mortality: A prospective general population cohort study comparing machine-learning and standard epidemiological approaches

Authors

  • Stephen F Weng
  • Luis Vaz
  • Nadeem Qureshi
  • Joe Kai

Abstract

Background: Prognostic modelling using standard methods is well established, particularly for predicting the risk of single diseases. Machine-learning may offer potential to explore outcomes of even greater complexity, such as premature death. This study aimed to develop novel prediction algorithms using machine-learning, in addition to standard survival modelling, to predict premature all-cause mortality.

Methods: A prospective population cohort of 502,628 participants aged 40–69 years was recruited to the UK Biobank from 2006 to 2010 and followed up until 2016. Participants were assessed on a range of demographic, biometric, clinical and lifestyle factors. Mortality data coded by ICD-10 were obtained through linkage to the Office for National Statistics. Models were developed using deep learning, random forest and Cox regression. Calibration was assessed by comparing observed with predicted risks, and discrimination by the area under the receiver operating characteristic curve (AUC).

Findings: 14,418 deaths (2.9%) occurred over a total follow-up time of 3,508,454 person-years. A simple age and gender Cox model was the least predictive (AUC 0.689, 95% CI 0.681–0.699). A multivariate Cox regression model significantly improved discrimination, by 6.2 percentage points (AUC 0.751, 95% CI 0.748–0.767). Relative to the multivariate Cox model, machine-learning algorithms improved discrimination by a further 3.2 percentage points using random forest (AUC 0.783, 95% CI 0.776–0.791) and 3.9 percentage points using deep learning (AUC 0.790, 95% CI 0.783–0.797). Relative to the simple age and gender Cox model, these gains were 9.4 and 10.1 percentage points respectively. Random forest and deep learning achieved similar levels of discrimination, with no significant difference between them. Machine-learning algorithms were well calibrated, while Cox regression models consistently over-predicted risk.

Conclusions: Machine-learning significantly improved the accuracy of prediction of premature all-cause mortality in this middle-aged population, compared with standard methods. This study illustrates the value of machine-learning for risk prediction within a traditional epidemiological study design, and how this approach might be reported to assist scientific verification.
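
The two evaluation measures named in the abstract, discrimination (AUC) and calibration (observed versus predicted risk), can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `auc`, `observed_to_expected` and `auc_gain` and the four-person toy data are hypothetical, the pairwise AUC below ignores censoring (the study itself used survival models), and only the constants are taken from the abstract.

```python
from typing import Sequence

# Figures quoted in the abstract.
PARTICIPANTS = 502_628
DEATHS = 14_418
PERSON_YEARS = 3_508_454
REPORTED_AUC = {
    "cox_age_gender": 0.689,
    "cox_multivariate": 0.751,
    "random_forest": 0.783,
    "deep_learning": 0.790,
}


def auc(risks: Sequence[float], died: Sequence[int]) -> float:
    """Discrimination: probability that a randomly chosen person who died
    was assigned a higher predicted risk than a randomly chosen survivor.
    Ties count as half. Hypothetical helper; censoring is ignored."""
    cases = [r for r, d in zip(risks, died) if d == 1]
    controls = [r for r, d in zip(risks, died) if d == 0]
    if not cases or not controls:
        raise ValueError("need at least one death and one survivor")
    concordant = sum(
        1.0 if c > s else 0.5 if c == s else 0.0
        for c in cases
        for s in controls
    )
    return concordant / (len(cases) * len(controls))


def observed_to_expected(risks: Sequence[float], died: Sequence[int]) -> float:
    """Calibration-in-the-large: observed deaths divided by the sum of
    predicted risks. A value below 1 means the model over-predicts risk."""
    return sum(died) / sum(risks)


def auc_gain(model: str, baseline: str) -> float:
    """Absolute AUC difference between two reported models,
    expressed in percentage points and rounded to one decimal."""
    return round(100 * (REPORTED_AUC[model] - REPORTED_AUC[baseline]), 1)


if __name__ == "__main__":
    # Toy data (made up for illustration): predicted risks and outcomes.
    toy_risks = [0.10, 0.40, 0.35, 0.80]
    toy_died = [0, 0, 1, 1]
    print("toy AUC:", auc(toy_risks, toy_died))
    print("toy O/E:", observed_to_expected(toy_risks, toy_died))

    # Arithmetic behind the figures reported in the abstract.
    print("crude mortality %:", round(100 * DEATHS / PARTICIPANTS, 1))
    print("deaths per 1,000 person-years:",
          round(1000 * DEATHS / PERSON_YEARS, 2))
    print("multivariate Cox vs age/gender Cox:",
          auc_gain("cox_multivariate", "cox_age_gender"))
    print("deep learning vs age/gender Cox:",
          auc_gain("deep_learning", "cox_age_gender"))
```

Read this way, the improvements quoted in the abstract are absolute differences between AUC values (for example 0.751 minus 0.689), not relative changes.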

Suggested Citation

  • Stephen F Weng & Luis Vaz & Nadeem Qureshi & Joe Kai, 2019. "Prediction of premature all-cause mortality: A prospective general population cohort study comparing machine-learning and standard epidemiological approaches," PLOS ONE, Public Library of Science, vol. 14(3), pages 1-22, March.
  • Handle: RePEc:plo:pone00:0214365
    DOI: 10.1371/journal.pone.0214365

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0214365
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0214365&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Stephen F Weng & Jenna Reps & Joe Kai & Jonathan M Garibaldi & Nadeem Qureshi, 2017. "Can machine-learning improve cardiovascular risk prediction using routine clinical data?," PLOS ONE, Public Library of Science, vol. 12(4), pages 1-14, April.
    2. Andre Esteva & Brett Kuprel & Roberto A. Novoa & Justin Ko & Susan M. Swetter & Helen M. Blau & Sebastian Thrun, 2017. "Dermatologist-level classification of skin cancer with deep neural networks," Nature, Nature, vol. 542(7639), pages 115-118, February.
    3. Kun-Hsing Yu & Ce Zhang & Gerald J. Berry & Russ B. Altman & Christopher Ré & Daniel L. Rubin & Michael Snyder, 2016. "Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features," Nature Communications, Nature, vol. 7(1), pages 1-10, November.

    Citations


    Cited by:

    1. George Papantonopoulos & Chryssa Delatola & Keiso Takahashi & Marja L Laine & Bruno G Loos, 2019. "Hidden noise in immunologic parameters might explain rapid progression in early-onset periodontitis," PLOS ONE, Public Library of Science, vol. 14(11), pages 1-14, November.
    2. Salvatore Tedesco & Martina Andrulli & Markus Åkerlund Larsson & Daniel Kelly & Antti Alamäki & Suzanne Timmons & John Barton & Joan Condell & Brendan O’Flynn & Anna Nordström, 2021. "Comparison of Machine Learning Techniques for Mortality Prediction in a Prospective Cohort of Older Adults," IJERPH, MDPI, vol. 18(23), pages 1-18, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Emily J MacKay & Michael D Stubna & Corey Chivers & Michael E Draugelis & William J Hanson & Nimesh D Desai & Peter W Groeneveld, 2021. "Application of machine learning approaches to administrative claims data to predict clinical outcomes in medical and surgical patient populations," PLOS ONE, Public Library of Science, vol. 16(6), pages 1-14, June.
    2. Majd Oteibi & Adam Tamimi & Kaneez Abbas & Gabriel Tamimi & Danesh Khazaei & Hadi Khazaei, 2024. "Advancing Digital Health using AI and Machine Learning Solutions for Early Ultrasonic Detection of Breast Disorders in Women," International Journal of Research and Scientific Innovation, International Journal of Research and Scientific Innovation (IJRSI), vol. 11(11), pages 518-527, November.
    3. Riccardo Zanardelli, 2025. "The human-machine paradox: how collaboration creates or destroys value, and why augmentation is key to resolving it," Papers 2509.14057, arXiv.org, revised Nov 2025.
    4. Lin Lu & Laurent Dercle & Binsheng Zhao & Lawrence H. Schwartz, 2021. "Deep learning for the prediction of early on-treatment response in metastatic colorectal cancer from serial medical imaging," Nature Communications, Nature, vol. 12(1), pages 1-11, December.
    5. Salvatore Tedesco & Martina Andrulli & Markus Åkerlund Larsson & Daniel Kelly & Antti Alamäki & Suzanne Timmons & John Barton & Joan Condell & Brendan O’Flynn & Anna Nordström, 2021. "Comparison of Machine Learning Techniques for Mortality Prediction in a Prospective Cohort of Older Adults," IJERPH, MDPI, vol. 18(23), pages 1-18, December.
    6. Zheng Yan & Wenqian Robertson & Yaosheng Lou & Tom W. Robertson & Sung Yong Park, 2021. "Finding leading scholars in mobile phone behavior: a mixed-method analysis of an emerging interdisciplinary field," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(12), pages 9499-9517, December.
    7. Freddy Gabbay & Rotem Lev Aharoni & Ori Schweitzer, 2022. "Deep Neural Network Memory Performance and Throughput Modeling and Simulation Framework," Mathematics, MDPI, vol. 10(21), pages 1-20, November.
    8. Ting Wang & Boyang Zang & Chui Kong & Yigang Li & Xiaomin Yang & Yi Yu, 2025. "Intelligent and precise auxiliary diagnosis of breast tumors using deep learning and radiomics," PLOS ONE, Public Library of Science, vol. 20(6), pages 1-11, June.
    9. Sonika Darshan, 2024. "Data Mining for Disease Diagnosis: A Review of Machine Learning Approaches in Healthcare," Journal of Artificial Intelligence General science (JAIGS) ISSN:3006-4023, Open Knowledge, vol. 6(1), pages 716-726.
    10. Gang Yu & Kai Sun & Chao Xu & Xing-Hua Shi & Chong Wu & Ting Xie & Run-Qi Meng & Xiang-He Meng & Kuan-Song Wang & Hong-Mei Xiao & Hong-Wen Deng, 2021. "Accurate recognition of colorectal cancer with semi-supervised deep learning on pathological images," Nature Communications, Nature, vol. 12(1), pages 1-13, December.
    11. Yue Sun & Songmin Dai & Jide Li & Yin Zhang & Xiaoqiang Li, 2019. "Tooth-Marked Tongue Recognition Using Gradient-Weighted Class Activation Maps," Future Internet, MDPI, vol. 11(2), pages 1-12, February.
    12. DonHee Lee & Seong No Yoon, 2021. "Application of Artificial Intelligence-Based Technologies in the Healthcare Industry: Opportunities and Challenges," IJERPH, MDPI, vol. 18(1), pages 1-18, January.
    13. Wenjuan Fan & Jingnan Liu & Shuwan Zhu & Panos M. Pardalos, 2020. "Investigating the impacting factors for the healthcare professionals to adopt artificial intelligence-based medical diagnosis support system (AIMDSS)," Annals of Operations Research, Springer, vol. 294(1), pages 567-592, November.
    14. Shang Li & Fei Yu & Shankou Zhang & Huige Yin & Hairong Lin, 2025. "Optimization of Direct Convolution Algorithms on ARM Processors for Deep Learning Inference," Mathematics, MDPI, vol. 13(5), pages 1-19, February.
    15. Young Jae Kim & Seung Seog Han & Hee Joo Yang & Sung Eun Chang, 2020. "Prospective, comparative evaluation of a deep neural network and dermoscopy in the diagnosis of onychomycosis," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-9, June.
    16. Dario Sipari & Betsy D. M. Chaparro-Rico & Daniele Cafolla, 2022. "SANE (Easy Gait Analysis System): Towards an AI-Assisted Automatic Gait-Analysis," IJERPH, MDPI, vol. 19(16), pages 1-27, August.
    17. N Salet & A Gökdemir & J Preijde & C H van Heck & F Eijkenaar, 2024. "Using machine learning to predict acute myocardial infarction and ischemic heart disease in primary care cardiovascular patients," PLOS ONE, Public Library of Science, vol. 19(7), pages 1-17, July.
    18. Darko B. Vuković & Senanu Dekpo-Adza & Stefana Matović, 2025. "AI integration in financial services: a systematic review of trends and regulatory challenges," Humanities and Social Sciences Communications, Palgrave Macmillan, vol. 12(1), pages 1-29, December.
    19. Ying Wang & Zhicheng Du & Wayne R. Lawrence & Yun Huang & Yu Deng & Yuantao Hao, 2019. "Predicting Hepatitis B Virus Infection Based on Health Examination Data of Community Population," IJERPH, MDPI, vol. 16(23), pages 1-13, December.
    20. Mara Giavina-Bianchi & Raquel Machado de Sousa & Vitor Zago de Almeida Paciello & William Gois Vitor & Aline Lissa Okita & Renata Prôa & Gian Lucca dos Santos Severino & Anderson Alves Schinaid & Rafa, 2021. "Implementation of artificial intelligence algorithms for melanoma screening in a primary care setting," PLOS ONE, Public Library of Science, vol. 16(9), pages 1-13, September.
