Printed from https://ideas.repec.org/a/gam/jijerp/v20y2023i5p4340-d1083579.html

Integrating Structured and Unstructured EHR Data for Predicting Mortality by Machine Learning and Latent Dirichlet Allocation Method

Author

Listed:
  • Chih-Chou Chiu

    (Department of Business Management, National Taipei University of Technology, Taipei 106, Taiwan)

  • Chung-Min Wu

    (Department of Business Management, National Taipei University of Technology, Taipei 106, Taiwan)

  • Te-Nien Chien

    (College of Management, National Taipei University of Technology, Taipei 106, Taiwan)

  • Ling-Jing Kao

    (Department of Business Management, National Taipei University of Technology, Taipei 106, Taiwan)

  • Chengcheng Li

    (College of Management, National Taipei University of Technology, Taipei 106, Taiwan)

  • Chuan-Mei Chu

    (College of Management, National Taipei University of Technology, Taipei 106, Taiwan)

Abstract

An intensive care unit (ICU) is a critical care unit that provides advanced medical support and continuous monitoring for patients with severe illnesses or injuries. Predicting the mortality rate of ICU patients can not only improve patient outcomes, but also optimize resource allocation. Many studies have attempted to create scoring systems and models that predict the mortality of ICU patients using large amounts of structured clinical data. However, unstructured clinical data recorded during patient admission, such as notes made by physicians, is often overlooked. This study used the MIMIC-III database to predict mortality in ICU patients. In the first part of the study, only eight structured variables were used, including the six basic vital signs, the Glasgow Coma Scale (GCS), and the patient’s age at admission. In the second part, unstructured predictor variables were extracted from the initial diagnosis made by physicians when the patients were admitted to the hospital and analyzed using Latent Dirichlet Allocation (LDA) techniques. The structured and unstructured data were combined using machine learning methods to create a mortality risk prediction model for ICU patients. The results showed that combining structured and unstructured data improved the accuracy of the prediction of clinical outcomes in ICU patients over time. The model achieved an area under the receiver operating characteristic curve (AUROC) of 0.88, indicating accurate prediction of patient vital status. Additionally, the model was able to predict patient clinical outcomes over time, successfully identifying important variables. This study demonstrated that a small number of easily collectible structured variables, combined with unstructured data and analyzed using LDA topic modeling, can significantly improve the predictive performance of a mortality risk prediction model for ICU patients. These results suggest that initial clinical observations and diagnoses of ICU patients contain valuable information that can aid ICU medical and nursing staff in making important clinical decisions.
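The abstract reports model quality as an AUROC of 0.88. As a rough illustration of what that number measures, the sketch below computes AUROC directly from its pairwise definition: the fraction of (deceased, survivor) patient pairs in which the deceased patient received the higher predicted risk, with ties counted as half. The function name `auroc` and the toy labels and scores are assumptions made for this example; they are not taken from the paper or from MIMIC-III, and this is not the authors' evaluation code.

```python
def auroc(labels, scores):
    """Pairwise AUROC: chance that a positive case outscores a negative one.

    labels: iterable of 0/1 outcomes (1 = died, 0 = survived; assumed coding).
    scores: iterable of predicted risks, same length as labels.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, hand-made data for illustration only (not study results).
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
toy_auc = auroc(toy_labels, toy_scores)
print(f"toy AUROC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of patients, which is fine for a small illustration; a rank-based computation would be the usual choice for a dataset the size of MIMIC-III.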

Suggested Citation

  • Chih-Chou Chiu & Chung-Min Wu & Te-Nien Chien & Ling-Jing Kao & Chengcheng Li & Chuan-Mei Chu, 2023. "Integrating Structured and Unstructured EHR Data for Predicting Mortality by Machine Learning and Latent Dirichlet Allocation Method," IJERPH, MDPI, vol. 20(5), pages 1-22, February.
  • Handle: RePEc:gam:jijerp:v:20:y:2023:i:5:p:4340-:d:1083579

    Download full text from publisher

    File URL: https://www.mdpi.com/1660-4601/20/5/4340/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1660-4601/20/5/4340/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhijun Yan & Lini Kuang & Liangfei Qiu, 2022. "Prosocial behaviors and economic performance: Evidence from an online mental healthcare platform," Production and Operations Management, Production and Operations Management Society, vol. 31(10), pages 3859-3876, October.
    2. Katherine Bobroske & Michael Freeman & Lawrence Huan & Anita Cattrell & Stefan Scholtes, 2022. "Curbing the Opioid Epidemic at Its Root: The Effect of Provider Discordance After Opioid Initiation," Management Science, INFORMS, vol. 68(3), pages 2003-2015, March.
    3. Xin Ma & Jing Guo & Xiao Sun, 2016. "DNABP: Identification of DNA-Binding Proteins Based on Feature Selection Using a Random Forest and Predicting Binding Residues," PLOS ONE, Public Library of Science, vol. 11(12), pages 1-20, December.
    4. Wenjuan Fan & Qiqi Zhou & Liangfei Qiu & Subodha Kumar, 2023. "Should Doctors Open Online Consultation Services? An Empirical Investigation of Their Impact on Offline Appointments," Information Systems Research, INFORMS, vol. 34(2), pages 629-651, June.
    5. Liangfei Qiu & Subodha Kumar & Arun Sen & Atish P. Sinha, 2022. "Impact of the Hospital Readmission Reduction Program on hospital readmission and mortality: An economic analysis," Production and Operations Management, Production and Operations Management Society, vol. 31(5), pages 2341-2360, May.
    6. Vishal Ahuja & Carlos A. Alvarez & Bradley R. Staats, 2020. "Maintaining Continuity in Service: An Empirical Examination of Primary Care Physicians," Manufacturing & Service Operations Management, INFORMS, vol. 22(5), pages 1088-1106, September.
    7. Subodha Kumar & Liangfei Qiu & Arun Sen & Atish P. Sinha, 2022. "Putting analytics into action in care coordination research: Emerging issues and potential solutions," Production and Operations Management, Production and Operations Management Society, vol. 31(6), pages 2714-2738, June.
    8. Rajapaksha, Dilini & Bergmeir, Christoph & Hyndman, Rob J., 2023. "LoMEF: A framework to produce local explanations for global model time series forecasts," International Journal of Forecasting, Elsevier, vol. 39(3), pages 1424-1447.
    9. Sushil Gupta & Medha Tekriwal & Carlos M. Parra, 2022. "Permeation of the term “analytics” in production and operations management research," Production and Operations Management, Production and Operations Management Society, vol. 31(10), pages 3651-3667, October.
    10. Chenyu Zhang & Jiayue Jiang & Hong Jin & Tinggui Chen, 2021. "The Impact of COVID-19 on Consumers’ Psychological Behavior Based on Data Mining for Online User Comments in the Catering Industry in China," IJERPH, MDPI, vol. 18(8), pages 1-19, April.
    11. Ghazalbash, Somayeh & Zargoush, Manaf & Mowbray, Fabrice & Costa, Andrew, 2022. "Impact of multimorbidity and frailty on adverse outcomes among older delayed discharge patients: Implications for healthcare policy," Health Policy, Elsevier, vol. 126(3), pages 197-206.
