Author
Listed:
- Thierry Jean
- Rose Guay Hottin
- Pierre Orban
Abstract
The promise of machine learning successfully exploiting digital phenotyping data to forecast mental states in psychiatric populations could greatly improve clinical practice. Previous research focused on binary classification and continuous regression, disregarding the often ordinal nature of prediction targets derived from clinical rating scales. In addition, mental health ratings typically show substantial class imbalance or skewness that needs to be accounted for when evaluating predictive performance. It also remains unclear which machine learning algorithm is best suited for forecasting tasks; eXtreme Gradient Boosting (XGBoost) and long short-term memory (LSTM) are 2 popular choices in digital phenotyping studies. The CrossCheck dataset includes 6,364 mental state surveys using 4-point ordinal rating scales and 23,551 days of smartphone sensor data contributed by patients with schizophrenia. We trained 120 machine learning models to forecast 10 mental states (e.g., Calm, Depressed, Seeing things) from passive sensor data on 2 predictive tasks (ordinal regression, binary classification) with 2 learning algorithms (XGBoost, LSTM) over 3 forecast horizons (same day, next day, next week). A majority of ordinal regression and binary classification models performed significantly above baseline, with macro-averaged mean absolute error values between 1.19 and 0.77, and balanced accuracy between 58% and 73%, which corresponds to similar levels of performance when these metrics are scaled. Results also showed that metrics that do not account for imbalance (mean absolute error, accuracy) systematically overestimated performance, that XGBoost models performed on par with or better than LSTM models, and that performance decreased significantly, yet only very slightly, as the forecast horizon expanded.
In conclusion, when using performance metrics that properly account for class imbalance, ordinal forecast models demonstrated comparable performance to the prevalent binary classification approach without losing valuable clinical information from self-reports, thus providing richer and easier to interpret predictions.
Author summary: Symptoms associated with mental health disorders vary greatly over time. Periods of partial remission unfortunately alternate with relapses defined by a marked worsening of symptoms. Hence, assessing future risk and adopting preventive measures is a key challenge for clinical psychiatry. With their many sensors, smartphones can provide novel insights into human behavior outside the medical office. By using machine learning, a branch of artificial intelligence, it is possible to use such smartphone sensor data to predict future mental states and symptoms in psychiatric patients. The present work highlights the importance of predicting fine-grained levels of symptom severity, as commonly reported by patients using so-called ordinal rating scales. Such ordinal predictions were not less accurate than the simplified binary predictions (on/off, high/low) often reported in previous efforts. Besides, we underscore that severe mental states are rare compared to healthy ones, and that this imbalance brings methodological challenges that need to be taken into account to develop valid predictive models.
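As a rough illustration of why the abstract separates imbalance-aware metrics from their plain counterparts, the following Python sketch computes all four metrics for a constant predictor on made-up ratings. The label counts, the 0 to 3 coding of the 4-point scale, and the binarization threshold are assumptions chosen for illustration; none of them come from the CrossCheck dataset or from the authors' code.

```python
from collections import defaultdict


def per_class_mean(y_true, values):
    """Average `values` within each true class, then average across classes."""
    groups = defaultdict(list)
    for label, value in zip(y_true, values):
        groups[label].append(value)
    return sum(sum(v) / len(v) for v in groups.values()) / len(groups)


def mean_absolute_error(y_true, y_pred):
    # Plain MAE: every sample has equal weight, so frequent classes dominate.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_mean_absolute_error(y_true, y_pred):
    # Macro-averaged MAE: every class has equal weight.
    return per_class_mean(y_true, [abs(t - p) for t, p in zip(y_true, y_pred)])


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    # Mean of per-class recall.
    return per_class_mean(y_true, [t == p for t, p in zip(y_true, y_pred)])


# Hypothetical, heavily skewed ratings on a 4-point scale coded 0..3.
ordinal_true = [0] * 70 + [1] * 20 + [2] * 8 + [3] * 2
ordinal_pred = [0] * len(ordinal_true)  # always predicts the majority class

# Hypothetical binarization: ratings of 2 or more count as the positive class.
binary_true = [int(r >= 2) for r in ordinal_true]
binary_pred = [0] * len(binary_true)

mae = mean_absolute_error(ordinal_true, ordinal_pred)
macro_mae = macro_mean_absolute_error(ordinal_true, ordinal_pred)
acc = accuracy(binary_true, binary_pred)
bal_acc = balanced_accuracy(binary_true, binary_pred)

print(f"MAE={mae:.2f}  macro-averaged MAE={macro_mae:.2f}")
print(f"accuracy={acc:.2f}  balanced accuracy={bal_acc:.2f}")
```

In this toy case the plain metrics look respectable (MAE 0.42, accuracy 0.90) even though the predictor never identifies a severe state, whereas the macro-averaged MAE (1.50) and the balanced accuracy (0.50) expose it.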
Suggested Citation
Thierry Jean & Rose Guay Hottin & Pierre Orban, 2025.
"Forecasting mental states in schizophrenia using digital phenotyping data,"
PLOS Digital Health, Public Library of Science, vol. 4(2), pages 1-20, February.
Handle: RePEc:plo:pdig00:0000734
DOI: 10.1371/journal.pdig.0000734