Author
Listed:
- Jia Wei
- Kevin Yuan
- Augustine Luk
- A Sarah Walker
- David W Eyre
Abstract
Community-acquired pneumonia (CAP) is common and a significant cause of mortality. However, CAP surveillance commonly relies on diagnostic codes from electronic health records (EHRs), with imperfect accuracy. We used Bayesian latent class models with multiple imputation to assess the accuracy of CAP diagnostic codes in the absence of a gold standard and to explore the contribution of various EHR data sources in improving CAP identification. Using 491,681 hospital admissions in Oxfordshire, UK, from 2016 to 2023, we investigated four EHR-based algorithms for CAP detection based on 1) primary diagnostic codes, 2) clinician-documented indications for antibiotic prescriptions, 3) radiology free-text reports, and 4) vital signs and blood tests. The estimated prevalence of CAP as the reason for emergency hospital admission was 13.6% (95% credible interval 13.3-14.0%). Primary diagnostic codes had low sensitivity but a high specificity (best fitting model, 0.275 and 0.997 respectively), as did vital signs with blood tests (0.348 and 0.963). Antibiotic indication text had a higher sensitivity (0.590) but a lower specificity (0.982), with radiology reports intermediate (0.485 and 0.960). Defining CAP as present when detected by any algorithm produced sensitivity and specificity of 0.873 and 0.905 respectively. Results remained consistent using alternative priors and in sensitivity analyses. Relying solely on diagnostic codes for CAP surveillance leads to substantial under-detection; combining EHR data across multiple algorithms enhances identification accuracy. Bayesian latent class analysis-based approaches could improve CAP surveillance and epidemiological estimates by integrating multiple EHR sources, even without a gold standard for CAP diagnosis.
Author summary: Community-acquired pneumonia (CAP) is a common and serious illness, but current surveillance usually relies on diagnostic codes from electronic health records (EHRs), which can be inaccurate.
In this study, we analysed nearly half a million hospital admissions in Oxfordshire, UK (2016–2023), to evaluate how well different EHR-based methods detect CAP. Since there is no perfect diagnostic test for CAP, we used Bayesian latent class models to estimate the performance of each method. We found that relying solely on primary diagnostic codes often misses CAP cases, while other data, including antibiotic prescriptions, radiology reports, and patients’ vital signs and blood tests, could improve detection. When all data sources were combined, performance in identifying CAP increased significantly. Our findings suggest that integrating multiple types of EHR data improves CAP identification. This approach could support more accurate disease surveillance and a better understanding of CAP prevalence, even in the absence of a perfect diagnostic standard.
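As a rough illustration of how an "any algorithm positive" rule trades sensitivity against specificity, the sketch below combines the four point estimates quoted in the abstract. It assumes the algorithms are conditionally independent given true CAP status. That assumption is ours, made only for illustration, and the abstract does not say the best-fitting model used it. The names `ESTIMATES` and `any_positive_rule` are hypothetical, and this is not the authors' latent class model.

```python
from math import prod

# Point estimates (sensitivity, specificity) quoted in the abstract
# for the best-fitting model.
ESTIMATES = {
    "primary diagnostic codes": (0.275, 0.997),
    "antibiotic indication text": (0.590, 0.982),
    "radiology reports": (0.485, 0.960),
    "vital signs and blood tests": (0.348, 0.963),
}


def any_positive_rule(estimates):
    """Accuracy of 'CAP present if any algorithm is positive'.

    ASSUMPTION (illustrative only): algorithms are conditionally
    independent given true CAP status.
    """
    # A true case is missed only if every algorithm misses it.
    sens = 1 - prod(1 - se for se, _ in estimates.values())
    # A non-case is correctly negative only if every algorithm is negative.
    spec = prod(sp for _, sp in estimates.values())
    return sens, spec


sens, spec = any_positive_rule(ESTIMATES)
print(f"sensitivity ~ {sens:.3f}, specificity ~ {spec:.3f}")
```

Under this independence assumption the combined specificity comes out at about 0.905, in line with the reported value, while the combined sensitivity comes out at about 0.900, somewhat above the reported 0.873. One possible reading is that the algorithms tend to detect the same true cases, so their hits overlap more than independence would predict, but the abstract alone does not confirm this.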
Suggested Citation
Jia Wei & Kevin Yuan & Augustine Luk & A Sarah Walker & David W Eyre, 2025.
"Community-acquired pneumonia identification from electronic health records in the absence of a gold standard: A Bayesian latent class analysis,"
PLOS Digital Health, Public Library of Science, vol. 4(7), pages 1-15, July.
Handle: RePEc:plo:pdig00:0000936
DOI: 10.1371/journal.pdig.0000936