Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists

Authors

Pranav Rajpurkar
Jeremy Irvin
Robyn L Ball
Kaylie Zhu
Brandon Yang
Hershel Mehta
Tony Duan
Daisy Ding
Aarti Bagul
Curtis P Langlotz
Bhavik N Patel
Kristen W Yeom
Katie Shpanskaya
Francis G Blankenberg
Jayne Seekins
Timothy J Amrhein
David A Mong
Safwan S Halabi
Evan J Zucker
Andrew Y Ng
Matthew P Lungren

Abstract

Background: Chest radiograph interpretation is critical for the detection of thoracic diseases, including tuberculosis and lung cancer, which affect millions of people worldwide each year. This time-consuming task typically requires expert radiologists to read the images, leading to fatigue-based diagnostic error and lack of diagnostic expertise in areas of the world where radiologists are not available. Recently, deep learning approaches have been able to achieve expert-level performance in medical image interpretation tasks, powered by large network architectures and fueled by the emergence of large labeled datasets. The purpose of this study is to investigate the performance of a deep learning algorithm on the detection of pathologies in chest radiographs compared with practicing radiologists.

Methods and findings: We developed CheXNeXt, a convolutional neural network to concurrently detect the presence of 14 different pathologies, including pneumonia, pleural effusion, pulmonary masses, and nodules in frontal-view chest radiographs. CheXNeXt was trained and internally validated on the ChestX-ray8 dataset, with a held-out validation set consisting of 420 images, sampled to contain at least 50 cases of each of the original pathology labels. On this validation set, the majority vote of a panel of 3 board-certified cardiothoracic specialist radiologists served as reference standard. We compared CheXNeXt’s discriminative performance on the validation set to the performance of 9 radiologists using the area under the receiver operating characteristic curve (AUC). The radiologists included 6 board-certified radiologists (average experience 12 years, range 4–28 years) and 3 senior radiology residents, from 3 academic institutions.

We found that CheXNeXt achieved radiologist-level performance on 11 pathologies and did not achieve radiologist-level performance on 3 pathologies. The radiologists achieved statistically significantly higher AUC performance on cardiomegaly, emphysema, and hiatal hernia, with AUCs of 0.888 (95% confidence interval [CI] 0.863–0.910), 0.911 (95% CI 0.866–0.947), and 0.985 (95% CI 0.974–0.991), respectively, whereas CheXNeXt’s AUCs were 0.831 (95% CI 0.790–0.870), 0.704 (95% CI 0.567–0.833), and 0.851 (95% CI 0.785–0.909), respectively. CheXNeXt performed better than radiologists in detecting atelectasis, with an AUC of 0.862 (95% CI 0.825–0.895), statistically significantly higher than radiologists' AUC of 0.808 (95% CI 0.777–0.838); there were no statistically significant differences in AUCs for the other 10 pathologies. The average time to interpret the 420 images in the validation set was substantially longer for the radiologists (240 minutes) than for CheXNeXt (1.5 minutes). The main limitations of our study are that neither CheXNeXt nor the radiologists were permitted to use patient history or review prior examinations and that evaluation was limited to a dataset from a single institution.

Conclusions: In this study, we developed and validated a deep learning algorithm that classified clinically important abnormalities in chest radiographs at a performance level comparable to practicing radiologists. Once tested prospectively in clinical settings, the algorithm could have the potential to expand patient access to chest radiograph diagnostics.

In their study, Pranav Rajpurkar and colleagues test a deep learning algorithm that classifies clinically important abnormalities in chest radiographs.
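
The abstract names two evaluation ingredients: a reference standard formed by the majority vote of 3 panel radiologists, and AUC as the measure of discriminative performance. The following is a minimal sketch of those two ideas for a single pathology, not the study's actual pipeline. The function names, the toy reads, and the toy model scores are all hypothetical, and the AUC here uses the plain Mann-Whitney formulation, which may differ from the estimation and confidence-interval procedure used in the paper.

```python
from typing import Sequence


def majority_vote(votes: Sequence[int]) -> int:
    """Return 1 if more than half of the binary votes are 1, else 0."""
    return int(sum(votes) * 2 > len(votes))


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case, counting ties as one half (Mann-Whitney)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 6 images, each with
# 3 binary panel reads for one pathology and one model probability.
panel_reads = [(1, 1, 0), (0, 0, 0), (1, 1, 1), (0, 1, 0), (1, 0, 1), (0, 0, 1)]
model_scores = [0.9, 0.2, 0.8, 0.4, 0.3, 0.1]

# Reference standard: majority vote of the 3 panel reads per image.
reference = [majority_vote(v) for v in panel_reads]
toy_auc = auc(reference, model_scores)
print(f"reference labels: {reference}, toy AUC: {toy_auc:.3f}")
```

In a multi-label setting such as the 14 pathologies described above, the same computation would be repeated once per pathology, each with its own reference labels and scores.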

Suggested Citation

Pranav Rajpurkar & Jeremy Irvin & Robyn L Ball & Kaylie Zhu & Brandon Yang & Hershel Mehta & Tony Duan & Daisy Ding & Aarti Bagul & Curtis P Langlotz & Bhavik N Patel & Kristen W Yeom & Katie Shpanska, 2018. "Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists," PLOS Medicine, Public Library of Science, vol. 15(11), pages 1-17, November.

Handle: RePEc:plo:pmed00:1002686
DOI: 10.1371/journal.pmed.1002686

Citations

Cited by:

Shashank Shetty & Ananthanarayana V S. & Ajit Mahale, 2022. "MS-CheXNet: An Explainable and Lightweight Multi-Scale Dilated Network with Depthwise Separable Convolution for Prediction of Pulmonary Abnormalities in Chest Radiographs," Mathematics, MDPI, vol. 10(19), pages 1-29, October.
Eric Engle & Andrei Gabrielian & Alyssa Long & Darrell E Hurt & Alex Rosenthal, 2020. "Performance of Qure.ai automatic classifiers against a large annotated database of patients with diverse forms of tuberculosis," PLOS ONE, Public Library of Science, vol. 15(1), pages 1-19, January.
Oded Rotem & Tamar Schwartz & Ron Maor & Yishay Tauber & Maya Tsarfati Shapiro & Marcos Meseguer & Daniella Gilboa & Daniel S. Seidman & Assaf Zaritsky, 2024. "Visual interpretability of image-based classification models by generative latent space disentanglement applied to in vitro fertilization," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
Seung Seog Han & Ik Jun Moon & Seong Hwan Kim & Jung-Im Na & Myoung Shin Kim & Gyeong Hun Park & Ilwoo Park & Keewon Kim & Woohyung Lim & Ju Hee Lee & Sung Eun Chang, 2020. "Assessment of deep neural networks for the diagnosis of benign and malignant skin neoplasms in comparison with dermatologists: A retrospective validation study," PLOS Medicine, Public Library of Science, vol. 17(11), pages 1-21, November.
Mingzhu Liu & Chirag Nagpal & Artur Dubrawski, 2024. "Deep Survival Models Can Improve Long-Term Mortality Risk Estimates from Chest Radiographs," Forecasting, MDPI, vol. 6(2), pages 1-14, May.
Caplin, Andrew & Martin, Daniel & Marx, Philip, 2025. "Modeling machine learning: A cognitive economic approach," Journal of Economic Theory, Elsevier, vol. 224(C).
- Andrew Caplin & Daniel J. Martin & Philip Marx, 2022. "Modeling Machine Learning: A Cognitive Economic Approach," NBER Working Papers 30600, National Bureau of Economic Research, Inc.
Mingsi Liu & Jinghui Wu & Nian Wang & Xianqin Zhang & Yujiao Bai & Jinlin Guo & Lin Zhang & Shulin Liu & Ke Tao, 2023. "The value of artificial intelligence in the diagnosis of lung cancer: A systematic review and meta-analysis," PLOS ONE, Public Library of Science, vol. 18(3), pages 1-20, March.
Eun Young Kim & Young Jae Kim & Won-Jun Choi & Gi Pyo Lee & Ye Ra Choi & Kwang Nam Jin & Young Jun Cho, 2021. "Performance of a deep-learning algorithm for referable thoracic abnormalities on chest radiographs: A multicenter study of a health screening cohort," PLOS ONE, Public Library of Science, vol. 16(2), pages 1-12, February.
Weijie Fan & Yi Yang & Jing Qi & Qichuan Zhang & Cuiwei Liao & Li Wen & Shuang Wang & Guangxian Wang & Yu Xia & Qihua Wu & Xiaotao Fan & Xingcai Chen & Mi He & JingJing Xiao & Liu Yang & Yun Liu & Jia, 2024. "A deep-learning-based framework for identifying and localizing multiple abnormalities and assessing cardiomegaly in chest X-ray," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
Tianyu Han & Sven Nebelung & Federico Pedersoli & Markus Zimmermann & Maximilian Schulze-Hagen & Michael Ho & Christoph Haarburger & Fabian Kiessling & Christiane Kuhl & Volkmar Schulz & Daniel Truhn, 2021. "Advancing diagnostic performance and clinical usability of neural networks via adversarial training and dual batch normalization," Nature Communications, Nature, vol. 12(1), pages 1-11, December.
David Autor & Andrew Caplin & Daniel Martin & Philip Marx, 2025. "Misaligned by Design: Incentive Failures in Machine Learning," Papers 2511.07699, arXiv.org.
