
A knowledge-informed large language model framework for U.S. nuclear power plant shutdown initiating event classification for probabilistic risk assessment

Author

Listed:
  • Min Xian
  • Tao Wang
  • Sai Zhang
  • Fei Xu
  • Zhegang Ma

Abstract

Identifying and classifying shutdown initiating events (SDIEs) is critical for developing shutdown probabilistic risk assessment for nuclear power plants. Existing computational approaches do not achieve satisfactory performance because large labeled datasets are unavailable, event types are imbalanced, and labels are noisy. To address these challenges, we propose a hybrid pipeline that integrates a knowledge-informed machine learning model to prescreen non-SDIEs and a large language model (LLM) to classify SDIEs into four types. In the prescreening stage, we propose a set of 44 SDIE text patterns that consist of the most salient keywords and phrases from six SDIE types. Text vectorization based on the SDIE patterns generates feature vectors that are highly separable with a simple binary classifier. The second stage builds a Bidirectional Encoder Representations from Transformers (BERT)-based LLM, which learns generic English language representations from self-supervised pretraining on a large dataset and is adapted to SDIE classification by fine-tuning on an SDIE dataset. The proposed approaches are evaluated on a dataset of 10,928 events using precision, recall, F1 score, and average accuracy. The results demonstrate that the prescreening stage can exclude more than 97% of non-SDIEs, and the LLM achieves an average accuracy of 95.1% for SDIE classification.
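The prescreening idea described in the abstract (pattern-based text vectorization followed by a simple binary classifier, scored with precision, recall, and F1) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the four regular expressions are hypothetical stand-ins for the paper's 44 SDIE patterns, which the abstract does not list; the threshold rule stands in for the unspecified binary classifier; and the event texts and labels are invented toy data, not drawn from the 10,928-event dataset.

```python
import re

# Hypothetical patterns for illustration only; the paper's 44 SDIE
# patterns are not given in the abstract.
PATTERNS = [
    r"loss of (shutdown|decay heat) cooling",
    r"loss of offsite power",
    r"reactor coolant .* (leak|drain)",
    r"inadvertent .* actuation",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in PATTERNS]


def vectorize(text):
    """Binary feature vector: entry i is 1 if pattern i occurs in the text."""
    return [1 if rx.search(text) else 0 for rx in COMPILED]


def prescreen(text, threshold=1):
    """Stand-in binary classifier (an assumption, not the paper's model):
    keep an event as a candidate SDIE if at least `threshold` patterns match."""
    return sum(vectorize(text)) >= threshold


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive (candidate SDIE) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented toy events: (report text, is it an SDIE?)
EVENTS = [
    ("Loss of shutdown cooling occurred during refueling outage.", True),
    ("Loss of offsite power while the unit was in cold shutdown.", True),
    ("Residual heat removal pump tripped, interrupting core cooling.", True),
    ("Routine surveillance test of the diesel generator completed.", False),
    ("Security badge reader failed at the main gate.", False),
]

y_true = [label for _, label in EVENTS]
y_pred = [prescreen(text) for text, _ in EVENTS]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

On this toy data the third event is missed because none of the hypothetical patterns covers its wording, which shows why pattern coverage drives recall in a screen of this kind. The second stage (fine-tuning a BERT-based model on the events that pass the screen) is not sketched here.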

Suggested Citation

  • Min Xian & Tao Wang & Sai Zhang & Fei Xu & Zhegang Ma, 2025. "A knowledge-informed large language model framework for U.S. nuclear power plant shutdown initiating event classification for probabilistic risk assessment," Journal of Risk and Reliability, vol. 239(6), pages 1257-1264, December.
  • Handle: RePEc:sae:risrel:v:239:y:2025:i:6:p:1257-1264
    DOI: 10.1177/1748006X251386900

    Download full text from publisher

    File URL: https://journals.sagepub.com/doi/10.1177/1748006X251386900
    Download Restriction: no