
Moving beyond the empty cell: The threat of decontextualized healthcare data

Author

Listed:
  • Aya El Mir
  • Eric Bezerra de Sousa
  • Ignacio Mesina-Estarrón
  • Leo Anthony Celi
  • Moad Hani
  • Mohammed Benjelloun
  • Neha Nageswaran
  • Saïd Mahmoudi
  • Shaheen Siddiqui
  • Sreeram Sadasivam
  • William Greig Mitchell

Abstract

Missing, inaccurate, or poorly documented data in healthcare is often treated as a technical problem to be statistically resolved via imputation, deletion, or modeling assumptions about randomness. However, such inaccuracies are tied to socioeconomic and geopolitical issues far more complex than “errors of data entry” to be ameliorated with statistical modeling techniques. We argue that what is really missing or inaccurate is the context in which the data is collected, and that only by understanding this context can we begin to prevent artificial intelligence (AI) from amplifying misleading, decontextualized data. We critically examine how traditional modeling methods fail to account for the factors that influence what data gets recorded, and for whom. We show how AI systems trained on decontextualized data reinforce health inequities at scale. We also review recent literature on context-aware approaches to understanding data that incorporate metadata, social determinants of health, fairness constraints, and participatory governance to build more ethical and representative systems. Our analysis urges the AI and healthcare communities to move beyond the traditional emphasis on statistical convenience, toward socially grounded and interdisciplinary strategies for handling decontextualized data.

Author summary

Healthcare data that is missing, incomplete, or inaccurately documented is often treated as a technical problem to be solved with statistical methods. We emphasize that this perspective overlooks the real issue: the data has been stripped of its context. Missing, incomplete, or inaccurate data (collectively termed decontextualized data) is not random; it is shaped by human decisions, social barriers, and systemic inequalities. Decontextualized healthcare data becomes increasingly dangerous as the use of AI in healthcare proliferates. Models trained on decontextualized data learn existing distortions as if they were objective truths. Consequently, their predictions risk reinforcing the very inequities that caused the flawed data in the first place and exacerbating health disparities at scale. We argue for a paradigm shift towards understanding why data becomes decontextualized. This requires a concerted effort between machine learning communities and domain experts who understand data context. It is only through this partnership that we can begin to build models that account for the complex realities embedded in decontextualized healthcare data, realities that sophisticated modeling techniques alone cannot resolve.

Suggested Citation

  • Aya El Mir & Eric Bezerra de Sousa & Ignacio Mesina-Estarrón & Leo Anthony Celi & Moad Hani & Mohammed Benjelloun & Neha Nageswaran & Saïd Mahmoudi & Shaheen Siddiqui & Sreeram Sadasivam & William Greig Mitchell, 2026. "Moving beyond the empty cell: The threat of decontextualized healthcare data," PLOS Digital Health, Public Library of Science, vol. 5(1), pages 1-9, January.
  • Handle: RePEc:plo:pdig00:0001194
    DOI: 10.1371/journal.pdig.0001194

    Download full text from publisher

    File URL: https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0001194
    Download Restriction: no

    File URL: https://journals.plos.org/digitalhealth/article/file?id=10.1371/journal.pdig.0001194&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pdig.0001194?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

