
Comparing deep learning and concept extraction based methods for patient phenotyping from clinical narratives

Authors
  • Sebastian Gehrmann
  • Franck Dernoncourt
  • Yeran Li
  • Eric T Carlson
  • Joy T Wu
  • Jonathan Welt
  • John Foote Jr.
  • Edward T Moseley
  • David W Grant
  • Patrick D Tyler
  • Leo A Celi

Abstract

In secondary analysis of electronic health records, a crucial task is correctly identifying the patient cohort under investigation. In many cases, the most valuable and relevant information for accurately classifying medical conditions exists only in clinical narratives. It is therefore necessary to use natural language processing (NLP) techniques to extract and evaluate information from these narratives. The most commonly used approach to this problem extracts a number of clinician-defined medical concepts from the text and uses machine learning techniques to identify whether a particular patient has a certain condition. However, recent advances in deep learning and NLP enable models to learn a rich representation of (medical) language. Convolutional neural networks (CNNs) for text classification can augment the existing techniques by leveraging this representation of language to learn which phrases in a text are relevant for a given medical condition. In this work, we compare concept extraction based methods with CNNs and other commonly used NLP models on ten phenotyping tasks using 1,610 discharge summaries from the MIMIC-III database. We show that CNNs outperform concept extraction based methods in almost all of the tasks, with improvements of up to 26 percentage points in F1-score and up to 7 percentage points in area under the ROC curve (AUC). We additionally assess the interpretability of both approaches by presenting and evaluating methods that calculate and extract the most salient phrases for a prediction. The results indicate that CNNs are a valid alternative to existing approaches in patient phenotyping and cohort identification and should be investigated further. Moreover, the deep learning approach presented in this paper can assist clinicians during chart review or support the extraction of billing codes from text by identifying and highlighting relevant phrases for various medical conditions.
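
As a reading aid, the two metrics named in the abstract (F1-score and AUC) and a difference expressed in percentage points can be sketched as follows. This is a minimal illustration in plain Python, not the authors' evaluation code: the function names f1_score, roc_auc, and percentage_point_gain are hypothetical, and the labels and scores are made-up toy values unrelated to the MIMIC-III results.

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def percentage_point_gain(baseline, candidate):
    """Difference between two metrics on a 0..1 scale, in percentage points."""
    return (candidate - baseline) * 100


# Toy data (assumed for illustration only): four patients, 1 = has the phenotype.
y_true = [1, 1, 0, 0]
pred_a = [1, 0, 1, 0]            # hard predictions of a hypothetical baseline
pred_b = [1, 1, 1, 0]            # hard predictions of a hypothetical second model
scores_a = [0.9, 0.4, 0.6, 0.1]  # predicted probabilities of the baseline
scores_b = [0.9, 0.7, 0.6, 0.1]  # predicted probabilities of the second model

f1_gain = percentage_point_gain(f1_score(y_true, pred_a), f1_score(y_true, pred_b))
auc_gain = percentage_point_gain(roc_auc(y_true, scores_a), roc_auc(y_true, scores_b))
```

A gain reported this way is an absolute difference between two scores, not a relative improvement, which is how the "percentage points" figures in the abstract should be read.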

Suggested Citation

  • Sebastian Gehrmann & Franck Dernoncourt & Yeran Li & Eric T Carlson & Joy T Wu & Jonathan Welt & John Foote Jr. & Edward T Moseley & David W Grant & Patrick D Tyler & Leo A Celi, 2018. "Comparing deep learning and concept extraction based methods for patient phenotyping from clinical narratives," PLOS ONE, Public Library of Science, vol. 13(2), pages 1-19, February.
  • Handle: RePEc:plo:pone00:0192360
    DOI: 10.1371/journal.pone.0192360

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0192360
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0192360&type=printable
    Download Restriction: no


    Citations


    Cited by:

    1. Seong‐H. Lee & Yanyuan Ma & Ying Wei & Jinbo Chen, 2023. "Optimal sampling for positive only electronic health record data," Biometrics, The International Biometric Society, vol. 79(4), pages 2974-2986, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lin Lu & Laurent Dercle & Binsheng Zhao & Lawrence H. Schwartz, 2021. "Deep learning for the prediction of early on-treatment response in metastatic colorectal cancer from serial medical imaging," Nature Communications, Nature, vol. 12(1), pages 1-11, December.
    2. Zheng Yan & Wenqian Robertson & Yaosheng Lou & Tom W. Robertson & Sung Yong Park, 2021. "Finding leading scholars in mobile phone behavior: a mixed-method analysis of an emerging interdisciplinary field," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(12), pages 9499-9517, December.
    3. Freddy Gabbay & Rotem Lev Aharoni & Ori Schweitzer, 2022. "Deep Neural Network Memory Performance and Throughput Modeling and Simulation Framework," Mathematics, MDPI, vol. 10(21), pages 1-20, November.
    4. Jungyoon Kim & Jihye Lim, 2021. "A Deep Neural Network-Based Method for Prediction of Dementia Using Big Data," IJERPH, MDPI, vol. 18(10), pages 1-13, May.
    5. Gang Yu & Kai Sun & Chao Xu & Xing-Hua Shi & Chong Wu & Ting Xie & Run-Qi Meng & Xiang-He Meng & Kuan-Song Wang & Hong-Mei Xiao & Hong-Wen Deng, 2021. "Accurate recognition of colorectal cancer with semi-supervised deep learning on pathological images," Nature Communications, Nature, vol. 12(1), pages 1-13, December.
    6. Yue Sun & Songmin Dai & Jide Li & Yin Zhang & Xiaoqiang Li, 2019. "Tooth-Marked Tongue Recognition Using Gradient-Weighted Class Activation Maps," Future Internet, MDPI, vol. 11(2), pages 1-12, February.
    7. DonHee Lee & Seong No Yoon, 2021. "Application of Artificial Intelligence-Based Technologies in the Healthcare Industry: Opportunities and Challenges," IJERPH, MDPI, vol. 18(1), pages 1-18, January.
    8. Wenjuan Fan & Jingnan Liu & Shuwan Zhu & Panos M. Pardalos, 2020. "Investigating the impacting factors for the healthcare professionals to adopt artificial intelligence-based medical diagnosis support system (AIMDSS)," Annals of Operations Research, Springer, vol. 294(1), pages 567-592, November.
    9. Young Jae Kim & Seung Seog Han & Hee Joo Yang & Sung Eun Chang, 2020. "Prospective, comparative evaluation of a deep neural network and dermoscopy in the diagnosis of onychomycosis," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-9, June.
    10. Claus Zippel & Sabine Bohnet-Joschko, 2021. "Rise of Clinical Studies in the Field of Machine Learning: A Review of Data Registered in ClinicalTrials.gov," IJERPH, MDPI, vol. 18(10), pages 1-14, May.
    11. Dario Sipari & Betsy D. M. Chaparro-Rico & Daniele Cafolla, 2022. "SANE (Easy Gait Analysis System): Towards an AI-Assisted Automatic Gait-Analysis," IJERPH, MDPI, vol. 19(16), pages 1-27, August.
    12. Mara Giavina-Bianchi & Raquel Machado de Sousa & Vitor Zago de Almeida Paciello & William Gois Vitor & Aline Lissa Okita & Renata Prôa & Gian Lucca dos Santos Severino & Anderson Alves Schinaid & Rafa, 2021. "Implementation of artificial intelligence algorithms for melanoma screening in a primary care setting," PLOS ONE, Public Library of Science, vol. 16(9), pages 1-13, September.
    13. Jamil Ahmad & Abdul Khader Jilani Saudagar & Khalid Mahmood Malik & Waseem Ahmad & Muhammad Badruddin Khan & Mozaherul Hoque Abul Hasanat & Abdullah AlTameem & Mohammed AlKhathami & Muhammad Sajjad, 2022. "Disease Progression Detection via Deep Sequence Learning of Successive Radiographic Scans," IJERPH, MDPI, vol. 19(1), pages 1-16, January.
    14. Rasheed Omobolaji Alabi & Alhadi Almangush & Mohammed Elmusrati & Ilmo Leivo & Antti Mäkitie, 2022. "Measuring the Usability and Quality of Explanations of a Machine Learning Web-Based Tool for Oral Tongue Cancer Prognostication," IJERPH, MDPI, vol. 19(14), pages 1-13, July.
    15. Jordi Munoz-Muriedas, 2021. "Large scale meta-analysis of preclinical toxicity data for target characterisation and hypotheses generation," PLOS ONE, Public Library of Science, vol. 16(6), pages 1-22, June.
    16. Magdalena K Sobol & Sarah A Finkelstein, 2018. "Predictive pollen-based biome modeling using machine learning," PLOS ONE, Public Library of Science, vol. 13(8), pages 1-29, August.
    17. Andreas Fügener & Jörn Grahl & Alok Gupta & Wolfgang Ketter, 2022. "Cognitive Challenges in Human–Artificial Intelligence Collaboration: Investigating the Path Toward Productive Delegation," Information Systems Research, INFORMS, vol. 33(2), pages 678-696, June.
    18. Vidhya V. & Anjan Gudigar & U. Raghavendra & Ajay Hegde & Girish R. Menon & Filippo Molinari & Edward J. Ciaccio & U. Rajendra Acharya, 2021. "Automated Detection and Screening of Traumatic Brain Injury (TBI) Using Computed Tomography Images: A Comprehensive Review and Future Perspectives," IJERPH, MDPI, vol. 18(12), pages 1-29, June.
    19. Pujin Wang & Jianzhuang Xiao & Ken’ichi Kawaguchi & Lichen Wang, 2022. "Automatic Ceiling Damage Detection in Large-Span Structures Based on Computer Vision and Deep Learning," Sustainability, MDPI, vol. 14(6), pages 1-24, March.
    20. Xu Gong & Keqin Guan & Qiyang Chen, 2022. "The role of textual analysis in oil futures price forecasting based on machine learning approach," Journal of Futures Markets, John Wiley & Sons, Ltd., vol. 42(10), pages 1987-2017, October.
