
Diversity and inclusion: A hidden additional benefit of Open Data

Author

Listed:
  • Marie-Laure Charpignon
  • Leo Anthony Celi
  • Marisa Cobanaj
  • Rene Eber
  • Amelia Fiske
  • Jack Gallifant
  • Chenyu Li
  • Gurucharan Lingamallu
  • Anton Petushkov
  • Robin Pierce

Abstract

The recent imperative by the National Institutes of Health to share scientific data publicly underscores a significant shift in academic research. Effective as of January 2023, it emphasizes that transparency in data collection and dedicated efforts towards data sharing are prerequisites for translational research, from the lab to the bedside. Given the role of data access in mitigating potential bias in clinical models, we hypothesize that researchers who leverage open-access datasets rather than privately-owned ones are more diverse. In this brief report, we proposed to test this hypothesis in the transdisciplinary and expanding field of artificial intelligence (AI) for critical care. Specifically, we compared the diversity among authors of publications leveraging open datasets, such as the commonly used MIMIC and eICU databases, with that among authors of publications relying exclusively on private datasets, unavailable to other research investigators (e.g., electronic health records from ICU patients accessible only to Mayo Clinic analysts). To measure the extent of author diversity, we characterized gender balance as well as the presence of researchers from low- and middle-income countries (LMIC) and minority-serving institutions (MSI) located in the United States (US). Our comparative analysis revealed a greater contribution of authors from LMICs and MSIs among researchers leveraging open critical care datasets (treatment group) than among those relying exclusively on private data resources (control group). The participation of women was similar between the two groups, albeit slightly larger in the former. Notably, although over 70% of all articles included at least one author inferred to be a woman, less than 25% had a woman as a first or last author. Importantly, we found that the proportion of authors from LMICs was substantially higher in the treatment than in the control group (10.1% vs. 6.2%, p
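The abstract reports a between-group difference in proportions (authors from LMICs in the treatment vs. control group) together with a p-value, but this record does not state which test the authors used or the underlying counts. As a minimal sketch, assuming a standard pooled two-proportion z-test, the comparison could look like the following. The function name `two_proportion_z_test` and the sample sizes are hypothetical; the counts were chosen only so that the proportions match the 10.1% and 6.2% quoted above, and they are not the paper's data.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for the difference of two independent proportions.

    x1, x2: number of "successes" in each group (e.g. LMIC-affiliated authors)
    n1, n2: total number of observations in each group
    Returns (z statistic, two-sided p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion under the null hypothesis that both groups share one rate
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# HYPOTHETICAL counts (not from the paper): 101 of 1000 authors in the
# treatment group and 62 of 1000 in the control group, i.e. 10.1% vs. 6.2%.
z, p = two_proportion_z_test(101, 1000, 62, 1000)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

With real data, the group sizes would come from the authors' publication sample, and a chi-squared or exact test may be a better fit when counts are small.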

Suggested Citation

  • Marie-Laure Charpignon & Leo Anthony Celi & Marisa Cobanaj & Rene Eber & Amelia Fiske & Jack Gallifant & Chenyu Li & Gurucharan Lingamallu & Anton Petushkov & Robin Pierce, 2024. "Diversity and inclusion: A hidden additional benefit of Open Data," PLOS Digital Health, Public Library of Science, vol. 3(7), pages 1-17, July.
  • Handle: RePEc:plo:pdig00:0000486
    DOI: 10.1371/journal.pdig.0000486

    Download full text from publisher

    File URL: https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000486
    Download Restriction: no

    File URL: https://journals.plos.org/digitalhealth/article/file?id=10.1371/journal.pdig.0000486&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Federica Cugnata & Chiara Brombin & Chiara Maria Poli & Roberto Buccione & Clelia Serio, 2024. "Modelling perception and resilience factors to data sharing in clinical and basic research: an observational study," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(6), pages 3169-3192, June.
    2. Celine E Snedden & James O Lloyd-Smith, 2024. "Predicting the presence of infectious virus from PCR data: A meta-analysis of SARS-CoV-2 in non-human primates," PLOS Pathogens, Public Library of Science, vol. 20(4), pages 1-39, April.
    3. Hamish S Fraser & Alvin Marcelo & Mahima Kalla & Khumbo Kalua & Leo A Celi & Jennifer Ziegler, 2023. "Digital determinants of health: Editorial," PLOS Digital Health, Public Library of Science, vol. 2(11), pages 1-4, November.
    4. Lisa Gibbs & Alexander J. Thomas & Alison Coelho & Adil Al-Qassas & Karen Block & Niamh Meagher & Limya Eisa & Stephanie Fletcher-Lartey & Tianhui Ke & Phoebe Kerr & Edwin Jit Leung Kwong & Colin MacD, 2023. "Inclusion of Cultural and Linguistic Diversity in COVID-19 Public Health Research: Research Design Adaptations to Seek Different Perspectives in Victoria, Australia," IJERPH, MDPI, vol. 20(3), pages 1-17, January.
    5. Brandy M Mapes & Christopher S Foster & Sheila V Kusnoor & Marcia I Epelbaum & Mona AuYoung & Gwynne Jenkins & Maria Lopez-Class & Dara Richardson-Heron & Ahmed Elmi & Karl Surkan & Robert M Cronin & , 2020. "Diversity and inclusion for the All of Us research program: A scoping review," PLOS ONE, Public Library of Science, vol. 15(7), pages 1-14, July.
