Authors
- Yen Ling Koon (Health Sciences Authority)
- Yan Tung Lam (Health Sciences Authority)
- Hui Xing Tan (Health Sciences Authority)
- Desmond Hwee Chun Teo (Health Sciences Authority)
- Jing Wei Neo (Health Sciences Authority)
- Aaron Jun Yi Yap (Health Sciences Authority)
- Pei San Ang (Health Sciences Authority)
- Celine Ping Wei Loke (Health Sciences Authority)
- Mun Yee Tham (Health Sciences Authority)
- Siew Har Tan (Health Sciences Authority)
- Sally Leng Bee Soh (Health Sciences Authority)
- Belinda Qin Pei Foo (Health Sciences Authority)
- Zheng Jye Ling (National University of Singapore, National University Health System)
- James Luen Wei Yip (National University Heart Centre, National University Health System)
- Sreemanee Raaj Dorajoo (Health Sciences Authority)
Abstract
Introduction: Transformer-based large language models (LLMs) have transformed the field of natural language processing and led to significant advancements in various text processing tasks. However, the applicability of these LLMs in identifying related drug-adverse event (AE) pairs within a clinical context may be limited by the prevalent use of non-standard sentence structures and grammar.

Method: Nine transformer-based LLMs pre-trained on biomedical domain corpora are fine-tuned on annotated data (n = 5088) to classify drug-AE pairs in unstructured discharge summaries as causally related or unrelated. These LLMs are then validated on text segments from deidentified hospital discharge summaries from Singapore (n = 1647). To assess generalisability, the models are also validated on annotated segments (n = 4418) from the Medical Information Mart for Intensive Care (MIMIC-III) database. The performance of the LLMs in identifying related drug-AE pairs is then compared against a prior benchmark set by traditional machine learning models on the same data.

Results: Using an LLM-bidirectional long short-term memory (LLM-BiLSTM) architecture, transformer-based LLMs improve the F1 score compared with the prior benchmark, with BioM-ELECTRA-Large-BiLSTM showing an average F1 score improvement of 16.1% (an increase from 0.64 to 0.74). Applying additional rules to the LLM-based predictions, such as ignoring drug-AE pairs when the AE is a known indication of the drug, further reduces false positive rates, with precision increases of up to 5.6% (a 0.04 increment).

Conclusion: Transformer-based LLMs outperform traditional machine learning methods in identifying causally related drug-AE pairs embedded within unstructured discharge summaries. Nonetheless, the improvement in performance with rules indicates that LLMs are still imperfect for this causal relation detection task.
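The rule-based post-processing step mentioned in the abstract (ignoring a drug-AE pair when the AE is a known indication of the drug) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Prediction` class, the `KNOWN_INDICATIONS` lookup, and the toy predictions are all hypothetical, chosen only to show how such a rule can remove false positives and raise precision.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Prediction:
    """One drug-AE pair with a model prediction and a gold label (hypothetical schema)."""
    drug: str
    adverse_event: str
    predicted_related: bool
    truly_related: bool


# Hypothetical lookup of known indications per drug; a real system would
# draw this from a curated drug reference source.
KNOWN_INDICATIONS = {
    "metformin": {"diabetes"},
}


def apply_indication_rule(pred: Prediction) -> Prediction:
    """Flip a positive prediction to negative when the 'AE' is a known indication."""
    indications = KNOWN_INDICATIONS.get(pred.drug.lower(), set())
    if pred.predicted_related and pred.adverse_event.lower() in indications:
        return replace(pred, predicted_related=False)
    return pred


def precision_recall_f1(preds: list[Prediction]) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 for the 'related' class."""
    tp = sum(p.predicted_related and p.truly_related for p in preds)
    fp = sum(p.predicted_related and not p.truly_related for p in preds)
    fn = sum(not p.predicted_related and p.truly_related for p in preds)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy predictions (invented for illustration, not data from the study).
preds = [
    Prediction("warfarin", "bleeding", True, True),
    Prediction("metformin", "diabetes", True, False),  # indication mistaken for an AE
    Prediction("amoxicillin", "rash", True, True),
    Prediction("aspirin", "headache", False, False),
]

before = precision_recall_f1(preds)
after = precision_recall_f1([apply_indication_rule(p) for p in preds])
print("before rule:", before)
print("after rule: ", after)
```

In this toy set the rule removes the single false positive, so precision rises while recall is unchanged. On real data such a rule could also suppress a true positive if an indication term coincides with a genuine adverse event, which is one reason to evaluate it against annotated data as the study does.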
Suggested Citation
Yen Ling Koon & Yan Tung Lam & Hui Xing Tan & Desmond Hwee Chun Teo & Jing Wei Neo & Aaron Jun Yi Yap & Pei San Ang & Celine Ping Wei Loke & Mun Yee Tham & Siew Har Tan & Sally Leng Bee Soh & Belinda , 2025.
"Effectiveness of Transformer-Based Large Language Models in Identifying Adverse Drug Reaction Relations from Unstructured Discharge Summaries in Singapore,"
Drug Safety, Springer, vol. 48(6), pages 667-677, June.
Handle:
RePEc:spr:drugsa:v:48:y:2025:i:6:d:10.1007_s40264-025-01525-w
DOI: 10.1007/s40264-025-01525-w