
Effectiveness of Transformer-Based Large Language Models in Identifying Adverse Drug Reaction Relations from Unstructured Discharge Summaries in Singapore

Author

Listed:
  • Yen Ling Koon

    (Health Sciences Authority)

  • Yan Tung Lam

    (Health Sciences Authority)

  • Hui Xing Tan

    (Health Sciences Authority)

  • Desmond Hwee Chun Teo

    (Health Sciences Authority)

  • Jing Wei Neo

    (Health Sciences Authority)

  • Aaron Jun Yi Yap

    (Health Sciences Authority)

  • Pei San Ang

    (Health Sciences Authority)

  • Celine Ping Wei Loke

    (Health Sciences Authority)

  • Mun Yee Tham

    (Health Sciences Authority)

  • Siew Har Tan

    (Health Sciences Authority)

  • Sally Leng Bee Soh

    (Health Sciences Authority)

  • Belinda Qin Pei Foo

    (Health Sciences Authority)

  • Zheng Jye Ling

    (National University of Singapore, National University Health System)

  • James Luen Wei Yip

    (National University Heart Centre, National University Health System)

  • Sreemanee Raaj Dorajoo

    (Health Sciences Authority)

Abstract

Introduction: Transformer-based large language models (LLMs) have transformed the field of natural language processing and led to significant advancements in various text processing tasks. However, the applicability of these LLMs in identifying related drug-adverse event (AE) pairs within clinical contexts may be limited by the prevalent use of non-standard sentence structures and grammar.

Method: Nine transformer-based LLMs pre-trained on biomedical domain corpora are fine-tuned on annotated data (n = 5088) to classify drug-AE pairs in unstructured discharge summaries as causally related or unrelated. These LLMs are then validated on text segments from deidentified hospital discharge summaries from Singapore (n = 1647). To assess generalisability, the models are also validated on annotated segments (n = 4418) from the Medical Information Mart for Intensive Care (MIMIC-III) database. The performance of the LLMs in identifying related drug-AE pairs is then compared against a prior benchmark set by traditional machine learning models on the same data.

Results: Using an LLM-bidirectional long short-term memory (LLM-BiLSTM) architecture, transformer-based LLMs improve the F1 score compared with the prior benchmark, with BioM-ELECTRA-Large-BiLSTM showing an average F1 score improvement of 16.1% (increase from 0.64 to 0.74). Applying additional rules to the LLM-based predictions, such as ignoring drug-AE pairs when the AE is a known indication of the drug, further reduces false-positive rates, with precision increases of up to 5.6% (0.04 increment).

Conclusion: Transformer-based LLMs outperform traditional machine learning methods in identifying causally related drug-AE pairs embedded within unstructured discharge summaries. Nonetheless, the improvement in performance gained from the rules indicates that LLMs remain imperfect at this causal relation detection task.
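The rule described in the Results (ignoring a drug-AE pair when the AE is a known indication of the drug) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `KNOWN_INDICATIONS` lookup, the `Prediction` type, the function names, and the four toy predictions are all hypothetical and chosen only to show how such a post-processing rule can raise precision without changing recall.

```python
from dataclasses import dataclass

# Hypothetical lookup of known indications per drug. The entries are
# illustrative only and are not taken from the paper.
KNOWN_INDICATIONS = {
    "metformin": {"type 2 diabetes mellitus"},
    "warfarin": {"atrial fibrillation"},
}


@dataclass(frozen=True)
class Prediction:
    drug: str
    adverse_event: str
    related: bool  # model output: True if predicted causally related


def apply_indication_rule(preds, indications=KNOWN_INDICATIONS):
    """Mark a pair as unrelated when the AE is a known indication of the drug."""
    filtered = []
    for p in preds:
        known = indications.get(p.drug.lower(), set())
        is_indication = p.adverse_event.lower() in known
        filtered.append(
            Prediction(p.drug, p.adverse_event, p.related and not is_indication)
        )
    return filtered


def precision_recall_f1(preds, gold):
    """Compute precision, recall and F1 for the 'related' class."""
    tp = sum(p.related and g for p, g in zip(preds, gold))
    fp = sum(p.related and not g for p, g in zip(preds, gold))
    fn = sum((not p.related) and g for p, g in zip(preds, gold))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Toy model outputs and gold labels (assumed data, for illustration only).
preds = [
    Prediction("warfarin", "bleeding", True),
    Prediction("warfarin", "atrial fibrillation", True),  # indication, not an AE
    Prediction("metformin", "lactic acidosis", True),
    Prediction("metformin", "nausea", False),  # missed true relation
]
gold = [True, False, True, True]

print("before rule:", precision_recall_f1(preds, gold))
print("after rule: ", precision_recall_f1(apply_indication_rule(preds), gold))
```

In this toy data the rule removes the single false positive (the indication mistaken for an adverse event), so precision rises while recall is unchanged. A rule like this can only help precision if the indication lookup is accurate; an incorrect entry would suppress true drug-AE relations.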

Suggested Citation

  • Yen Ling Koon & Yan Tung Lam & Hui Xing Tan & Desmond Hwee Chun Teo & Jing Wei Neo & Aaron Jun Yi Yap & Pei San Ang & Celine Ping Wei Loke & Mun Yee Tham & Siew Har Tan & Sally Leng Bee Soh & Belinda Qin Pei Foo & Zheng Jye Ling & James Luen Wei Yip & Sreemanee Raaj Dorajoo, 2025. "Effectiveness of Transformer-Based Large Language Models in Identifying Adverse Drug Reaction Relations from Unstructured Discharge Summaries in Singapore," Drug Safety, Springer, vol. 48(6), pages 667-677, June.
  • Handle: RePEc:spr:drugsa:v:48:y:2025:i:6:d:10.1007_s40264-025-01525-w
    DOI: 10.1007/s40264-025-01525-w

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s40264-025-01525-w
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.
