Authors
- Jae-Joong Lee
- Jihoon Han
- Choong-Wan Woo
Abstract
Detecting depression from conversational text using large language models (LLMs) has garnered significant interest. However, the limited interpretability of existing methods presents a major challenge for clinical application. To address this, we propose a novel framework for automatic depression assessment, which employs LLM prompting to extract interpretable factors linked to depression from text and uses linear regression to predict severity scores. We evaluated our approach using a benchmark dataset (DAIC-WOZ; n = 186), predicting Patient Health Questionnaire (PHQ)-8 scores from clinical interview transcripts. Our method identifies key behavioral and linguistic features indicative of depression while also achieving state-of-the-art performance with a mean absolute error (MAE) of 2.91 on the test set. The resulting model further generalizes to an independent test dataset (E-DAIC; n = 86) with an MAE of 2.86. These findings suggest that interpretable LLM-based approaches hold significant promise for enhancing the clinical utility of automated depression assessment.
Author summary
Depression is a common and serious mental health concern, and there is a growing need to develop fast and accessible screening tools. Recently, detecting depression from conversational texts using large language models (LLMs) has emerged as a promising solution. However, most LLM-based methods operate as “black-box” models that provide little insight into how decisions are made, limiting their use in clinical settings. In this study, we propose a novel framework to enhance the interpretability of LLM-based depression assessment. Rather than asking an LLM to provide a single overall assessment, we prompt it to evaluate a set of specific depression-related factors in the text, spanning clinical symptoms, linguistic patterns, and cognitive distortions. These factor scores are then used in a linear regression model to predict depression severity, enabling a transparent understanding of which features contribute to the prediction. When evaluated on a benchmark clinical interview dataset, our method achieves state-of-the-art performance while also identifying key behavioral and linguistic markers of depression. Moreover, the resulting model further generalizes to an independent test dataset. These findings suggest that interpretable LLM-based approaches hold significant promise for enhancing the clinical utility of automated depression assessment.
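The second stage described above (regressing a severity score on per-transcript factor scores and reporting MAE) can be illustrated with a minimal sketch. This is not the authors' implementation: the factor names, the synthetic data, and the helper names `fit_ols`, `predict`, and `mae` are all assumptions made for illustration, and the LLM prompting step is replaced by randomly generated factor scores.

```python
import random

# Hypothetical factor names; the paper's actual factor set is not listed here.
FACTORS = ["low_mood", "negative_self_view", "first_person_focus"]

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_ols(X, y):
    """Ordinary least squares with an intercept, via the normal equations.

    Returns [intercept, w_1, ..., w_p].
    """
    A = [[1.0] + list(row) for row in X]  # prepend intercept column
    p = len(A[0])
    ata = [[sum(r[i] * r[j] for r in A) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * t for r, t in zip(A, y)) for i in range(p)]
    return solve(ata, aty)

def predict(w, X):
    """Apply fitted weights to rows of factor scores."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], row)) for row in X]

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

# Synthetic stand-in for LLM-rated factor scores (NOT real data).
rng = random.Random(0)
X = [[rng.uniform(0, 1) for _ in FACTORS] for _ in range(60)]
true_w = [2.0, 8.0, 5.0, 3.0]  # arbitrary weights used only to generate toy targets
y = [true_w[0] + sum(w * x for w, x in zip(true_w[1:], row)) + rng.gauss(0, 1)
     for row in X]

# Fit on the first 45 toy transcripts, evaluate on the remaining 15.
w = fit_ols(X[:45], y[:45])
for name, coef in zip(["intercept"] + FACTORS, w):
    print(f"{name}: {coef:.2f}")
print("held-out MAE:", round(mae(y[45:], predict(w, X[45:])), 2))
```

Because the model is linear, each fitted coefficient can be read directly as the change in predicted severity per unit change in that factor score, which is the sense of interpretability the summary describes.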
Suggested Citation
Jae-Joong Lee & Jihoon Han & Choong-Wan Woo, 2026.
"Interpretable depression assessment using a large language model,"
PLOS Digital Health, Public Library of Science, vol. 5(2), pages 1-18, February.
Handle:
RePEc:plo:pdig00:0001205
DOI: 10.1371/journal.pdig.0001205