Author
Listed:
- Helene Bei Thomsen
- Livie Yumeng Li
- Anders Aasted Isaksen
- Benjamin Lebiecka-Johansen
- Charline Bour
- Guy Fagherazzi
- William P T M van Doorn
- Tibor V Varga
- Adam Hulman
Abstract
Non-Hispanic White (White) populations are overrepresented in medical studies. Healthcare disparities can arise when machine learning models used in diabetes technologies are trained on data from primarily White patients. We aimed to evaluate algorithmic fairness in glucose predictions. This study used continuous glucose monitoring (CGM) data from 101 White and 104 Black participants with type 1 diabetes, collected by the Jaeb Center for Health Research, US. Long short-term memory (LSTM) deep learning models were trained on 11 datasets with different proportions of White and Black participants, then tailored to each individual using transfer learning, to predict glucose 60 minutes ahead based on 60-minute windows. Root mean squared errors (RMSE) were calculated for each participant. Linear mixed-effect models were used to investigate the association between racial composition and RMSE while accounting for age, sex, and training data size. A median of 9 weeks (IQR: 7, 10) of CGM data was available per participant. The divergence in performance (RMSE slope by proportion) was not statistically significant for either group. However, the slope difference between groups (from 0% White and 100% Black to 100% White and 0% Black) was statistically significant (p = 0.02): the RMSE increased 0.04 [0.01, 0.08] mmol/L more for Black participants than for White participants when the proportion of White participants in the training data increased from 0% to 100%. This difference was attenuated in the transfer-learned models (RMSE: 0.02 [-0.01, 0.05] mmol/L, p = 0.20). The racial composition of the training data created a small, statistically significant difference in model performance, which was not present after transfer learning.
This demonstrates the importance of diversity in datasets and the potential value of transfer learning for developing fairer prediction models.
Author summary
Non-Hispanic White populations are often overrepresented in medical datasets. Training machine learning models on such data may lead to unfair clinical prediction tools and may worsen healthcare inequalities. This study investigated how well machine learning models predict blood sugar levels for Non-Hispanic White and Non-Hispanic Black people with type 1 diabetes. We used continuous glucose monitoring (CGM) data from people with type 1 diabetes living in the US to compare various methods and models trained on datasets with different proportions of White and Black participants. As the proportion of White participants in the training dataset increased, we found a difference between the performance improvement in White participants and the performance drop in Black participants. This difference disappeared when models were further tailored to individuals. Our work demonstrates the importance of using diverse training data when developing AI-based solutions for healthcare.
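The evaluation setup described in the abstract (60-minute input windows, a 60-minute prediction horizon, per-participant RMSE, and 11 training mixes) can be illustrated with a minimal sketch. This is not the authors' code: the 5-minute sampling interval, the evenly spaced 10% steps between training mixes, and all function names are assumptions made for illustration only, and no LSTM is included.

```python
import math

SAMPLE_MINUTES = 5                      # assumed CGM sampling interval (not stated in the abstract)
WINDOW_STEPS = 60 // SAMPLE_MINUTES     # 60-minute input window -> 12 readings
HORIZON_STEPS = 60 // SAMPLE_MINUTES    # 60 minutes ahead -> 12 steps after the window ends


def make_windows(glucose):
    """Build (window, target) pairs from one participant's glucose series in mmol/L.

    Each window holds WINDOW_STEPS consecutive readings; the target is the
    reading HORIZON_STEPS after the last reading in the window.
    """
    pairs = []
    last_start = len(glucose) - WINDOW_STEPS - HORIZON_STEPS
    for start in range(last_start + 1):
        window = glucose[start:start + WINDOW_STEPS]
        target = glucose[start + WINDOW_STEPS + HORIZON_STEPS - 1]
        pairs.append((window, target))
    return pairs


def rmse(predicted, observed):
    """Root mean squared error between two equal-length, non-empty sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))


def white_proportions(n_datasets=11):
    """Proportion of White participants per training mix, assuming even spacing from 0 to 1."""
    return [i / (n_datasets - 1) for i in range(n_datasets)]
```

A per-participant RMSE would then be obtained by passing a model's predictions and the corresponding targets from `make_windows` to `rmse`, once for each of the training mixes returned by `white_proportions`.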
Suggested Citation
Helene Bei Thomsen & Livie Yumeng Li & Anders Aasted Isaksen & Benjamin Lebiecka-Johansen & Charline Bour & Guy Fagherazzi & William P T M van Doorn & Tibor V Varga & Adam Hulman, 2025.
"Racial disparities in continuous glucose monitoring-based 60-min glucose predictions among people with type 1 diabetes,"
PLOS Digital Health, Public Library of Science, vol. 4(6), pages 1-13, June.
Handle:
RePEc:plo:pdig00:0000918
DOI: 10.1371/journal.pdig.0000918