Printed from https://ideas.repec.org/p/ecl/stabus/3950.html

Uncovering Interpretable Potential Confounders in Electronic Medical Records

Authors

  • Zeng, Jiaming (Stanford University)
  • Gensheimer, Michael F. (Stanford University)
  • Rubin, Daniel L. (Stanford University)
  • Athey, Susan (Stanford University)
  • Schachter, Ross D. (Stanford University)

Abstract

In medicine, randomized clinical trials (RCTs) are the gold standard for informing treatment decisions. Observational comparative effectiveness research (CER) is often plagued by selection bias, and expert-selected covariates may not be sufficient to adjust for confounding. We explore how the unstructured clinical text in electronic medical records (EMRs) can be used to reduce selection bias and improve medical practice. We develop a method based on natural language processing to uncover interpretable potential confounders from the clinical text. We validate our method by comparing the hazard ratio (HR) from survival analysis, estimated with and without the uncovered confounders, against the results from established RCTs. We apply our method to four study cohorts built from localized prostate and lung cancer datasets from the Stanford Cancer Institute Research Database and show that our method adjusts the HR estimate towards the RCT results. We further confirm that the uncovered terms can be interpreted by an oncologist as potential confounders. This research helps enable more credible causal inference using data from EMRs, offers a transparent way to improve the design of observational CER, and could inform high-stakes medical decisions. Our method can also be applied to studies within and beyond medicine to extract important information from observational data to support decisions.
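
The validation idea in the abstract, comparing an HR estimated with and without an added confounder, can be illustrated with a small sketch. This is not the authors' method: the abstract does not specify the survival model, so the sketch swaps in a much simpler technique, a Mantel-Haenszel rate ratio on person-time data, which approximates the HR only under the assumption of constant hazards. The cohort counts and the `frail` flag (standing in for a term uncovered from clinical notes) are entirely hypothetical.

```python
from collections import defaultdict

# Hypothetical aggregated cohort (not data from the paper):
# (treated, frail_flag, events, person_years)
ROWS = [
    (1, 0, 10, 1000.0),
    (0, 0, 5, 500.0),
    (1, 1, 5, 100.0),
    (0, 1, 50, 1000.0),
]


def crude_hazard_ratio(rows):
    """Event rate in treated over event rate in untreated, ignoring the flag."""
    events = defaultdict(float)
    time = defaultdict(float)
    for treated, _flag, d, t in rows:
        events[treated] += d
        time[treated] += t
    return (events[1] / time[1]) / (events[0] / time[0])


def stratified_hazard_ratio(rows):
    """Mantel-Haenszel rate ratio, pooling over strata of the flag."""
    strata = defaultdict(dict)
    for treated, flag, d, t in rows:
        strata[flag][treated] = (d, t)
    num = den = 0.0
    for arms in strata.values():
        d1, t1 = arms[1]  # treated arm: events, person-time
        d0, t0 = arms[0]  # untreated arm: events, person-time
        total = t1 + t0
        num += d1 * t0 / total
        den += d0 * t1 / total
    return num / den


crude = crude_hazard_ratio(ROWS)
adjusted = stratified_hazard_ratio(ROWS)
print(f"crude HR: {crude:.2f}, adjusted HR: {adjusted:.2f}")
# prints "crude HR: 0.37, adjusted HR: 1.00"
```

In this toy cohort the treatment has no effect within either stratum, yet the crude estimate suggests a strong benefit because frail patients are mostly untreated. Adjusting for the flag moves the estimate to 1.00, the kind of shift the abstract describes when an uncovered confounder is added.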

Suggested Citation

  • Zeng, Jiaming & Gensheimer, Michael F. & Rubin, Daniel L. & Athey, Susan & Schachter, Ross D., 2021. "Uncovering Interpretable Potential Confounders in Electronic Medical Records," Research Papers 3950, Stanford University, Graduate School of Business.
  • Handle: RePEc:ecl:stabus:3950

    Download full text from publisher

    File URL: https://www.gsb.stanford.edu/faculty-research/working-papers/uncovering-interpretable-potential-confounders-electronic-medical
    Download Restriction: no
    Citations

    Cited by:

    1. Carrizosa, Emilio & Ramírez-Ayerbe, Jasone & Romero Morales, Dolores, 2024. "Mathematical optimization modelling for group counterfactual explanations," European Journal of Operational Research, Elsevier, vol. 319(2), pages 399-412.
    2. Felix Drinkall & Stefan Zohren & Michael McMahon & Janet B. Pierrehumbert, 2025. "Stories that (are) Move(d by) Markets: A Causal Exploration of Market Shocks and Semantic Shifts across Different Partisan Groups," Papers 2502.14497, arXiv.org.
    3. Takanobu Hirosawa & Yukinori Harada & Masashi Yokose & Tetsu Sakamoto & Ren Kawamura & Taro Shimizu, 2023. "Diagnostic Accuracy of Differential-Diagnosis Lists Generated by Generative Pretrained Transformer 3 Chatbot for Clinical Vignettes with Common Chief Complaints: A Pilot Study," IJERPH, MDPI, vol. 20(4), pages 1-10, February.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.