
Natural language processing (NLP) and association rules (AR)-based knowledge extraction for intelligent fault analysis: a case study in semiconductor industry

Author

Listed:
  • Zhiqiang Wang

    (Research Center)

  • Kenneth Ezukwoke

    (Mines Saint-Étienne, Univ. Clermont Auvergne, CNRS UMR 6158 LIMOS
    Henri FAYOL Institute)

  • Anis Hoayek

    (Mines Saint-Étienne, Univ. Clermont Auvergne, CNRS UMR 6158 LIMOS
    Henri FAYOL Institute)

  • Mireille Batton-Hubert

    (Mines Saint-Étienne, Univ. Clermont Auvergne, CNRS UMR 6158 LIMOS
    Henri FAYOL Institute)

  • Xavier Boucher

    (Mines Saint-Étienne, Univ. Clermont Auvergne, CNRS UMR 6158 LIMOS
    Center for Biomedical and Healthcare Engineering)

Abstract

Fault analysis (FA) is the process of collecting and analyzing data to determine the cause of a failure. It plays an important role in ensuring quality in the manufacturing process. Traditional FA techniques are time-consuming and labor-intensive, relying heavily on human expertise and on the availability of failure inspection equipment. In the semiconductor industry, experts generate a large number of FA reports that record fault descriptions, fault analysis paths, and fault root causes. Advances in Artificial Intelligence make it possible to automate the industrial FA process while extracting expert knowledge from this large body of FA reports. The goal of this research is to develop a complete expert knowledge extraction pipeline for FA in the semiconductor industry, based on advanced Natural Language Processing and Machine Learning, that automatically predicts the fault root cause from the fault description. First, the text of the FA reports is transformed into numerical data using Sentence Transformer embeddings. Second, the numerical data are mapped into latent spaces by a Generalized-Controllable Variational AutoEncoder. Third, the latent representations are assigned labels by a Gaussian Mixture Model. Finally, Association Rules are applied to establish the relationship between the labels in the latent space of the fault descriptions and those in the latent space of the fault root causes. The proposed algorithm has been evaluated on real semiconductor industry data collected over three years. The predicted labels reach an average correctness of 97.8%. The method can effectively reduce both the time needed for failure identification and the cost of the inspection stage.
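The final step of the pipeline, linking description labels to root-cause labels through association rules, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which rule-mining algorithm or thresholds were used, so the function names (`mine_rules`, `predict`), the thresholds, and the cluster labels (`D0`, `R1`, ...) are all hypothetical. It assumes the earlier steps have already assigned one description label and one root-cause label to each report, and it computes only the standard support and confidence measures for one-to-one rules.

```python
from collections import Counter


def mine_rules(pairs, min_support=0.1, min_confidence=0.6):
    """Mine one-to-one rules 'description label -> root-cause label'.

    pairs: list of (description_label, root_cause_label), one per report.
    Returns (antecedent, consequent, support, confidence) tuples,
    sorted by confidence, then support, both descending.
    """
    n = len(pairs)
    if n == 0:
        return []
    pair_counts = Counter(pairs)                       # co-occurrence counts
    antecedent_counts = Counter(d for d, _ in pairs)   # description-label counts
    rules = []
    for (d, r), count in pair_counts.items():
        support = count / n                            # P(d and r)
        confidence = count / antecedent_counts[d]      # P(r | d)
        if support >= min_support and confidence >= min_confidence:
            rules.append((d, r, support, confidence))
    rules.sort(key=lambda rule: (-rule[3], -rule[2]))
    return rules


def predict(rules, description_label):
    """Return the root-cause label of the best rule for this description label."""
    for d, r, _, _ in rules:
        if d == description_label:
            return r
    return None  # no rule passed the thresholds for this label


# Hypothetical labels, standing in for the output of the clustering step.
reports = [
    ("D0", "R1"), ("D0", "R1"), ("D0", "R1"), ("D0", "R2"),
    ("D1", "R0"), ("D1", "R0"),
    ("D2", "R2"), ("D2", "R1"),
]
rules = mine_rules(reports)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support}, confidence={confidence}")
print(predict(rules, "D0"))
```

With these toy labels, only `D1 -> R0` and `D0 -> R1` pass both thresholds. A description labelled `D2` yields no prediction, because its two candidate root causes are equally likely and neither reaches the confidence threshold.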

Suggested Citation

  • Zhiqiang Wang & Kenneth Ezukwoke & Anis Hoayek & Mireille Batton-Hubert & Xavier Boucher, 2025. "Natural language processing (NLP) and association rules (AR)-based knowledge extraction for intelligent fault analysis: a case study in semiconductor industry," Journal of Intelligent Manufacturing, Springer, vol. 36(1), pages 357-372, January.
  • Handle: RePEc:spr:joinma:v:36:y:2025:i:1:d:10.1007_s10845-023-02245-7
    DOI: 10.1007/s10845-023-02245-7

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10845-023-02245-7
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s10845-023-02245-7?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Xu, Zhaoyi & Saleh, Joseph Homer, 2021. "Machine learning for reliability engineering and safety applications: Review of current status and future opportunities," Reliability Engineering and System Safety, Elsevier, vol. 211(C).
    2. Hahsler, Michael & Grün, Bettina & Hornik, Kurt, 2005. "arules - A Computational Environment for Mining Association Rules and Frequent Item Sets," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 14(i15).
    3. Chia-Yu Hsu & Wei-Chen Liu, 2021. "Multiple time-series convolutional neural network for fault detection and diagnosis and empirical study in semiconductor manufacturing," Journal of Intelligent Manufacturing, Springer, vol. 32(3), pages 823-836, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Li, Yuanfu & Chen, Yao & Hu, Zhenchao & Zhang, Huisheng, 2023. "Remaining useful life prediction of aero-engine enabled by fusing knowledge and deep learning models," Reliability Engineering and System Safety, Elsevier, vol. 229(C).
    2. Liu, Jiale & Wang, Huan, 2024. "A brain-inspired energy-efficient Wide Spiking Residual Attention Framework for intelligent fault diagnosis," Reliability Engineering and System Safety, Elsevier, vol. 243(C).
    3. Zhang, Dingyang & Zhang, Yiming & Li, Pei & Zhang, Shuyou, 2025. "Kernel Reinforcement Learning for sampling-efficient risk management of large-scale engineering systems," Reliability Engineering and System Safety, Elsevier, vol. 260(C).
    4. Yuan, Zixia & Xiong, Guojiang & Fu, Xiaofan & Mohamed, Ali Wagdy, 2023. "Improving fault tolerance in diagnosing power system failures with optimal hierarchical extreme learning machine," Reliability Engineering and System Safety, Elsevier, vol. 236(C).
    5. Chen, Edward & Bao, Han & Dinh, Nam, 2024. "Evaluating the reliability of machine-learning-based predictions used in nuclear power plant instrumentation and control systems," Reliability Engineering and System Safety, Elsevier, vol. 250(C).
    6. Costa, Nahuel & Sánchez, Luciano, 2022. "Variational encoding approach for interpretable assessment of remaining useful life estimation," Reliability Engineering and System Safety, Elsevier, vol. 222(C).
    7. Bakeer, Tammam, 2023. "General partial safety factor theory for the assessment of the reliability of nonlinear structural systems," Reliability Engineering and System Safety, Elsevier, vol. 234(C).
    8. Lewis, Austin D. & Groth, Katrina M., 2022. "Metrics for evaluating the performance of complex engineering system health monitoring models," Reliability Engineering and System Safety, Elsevier, vol. 223(C).
    9. Yoichi Matsumoto, 2013. "Heterogeneous Combinations of Knowledge Elements: How the Knowledge Base Structure Impacts Knowledge-related Outcomes of a Firm," Discussion Paper Series DP2013-15, Research Institute for Economics & Business Administration, Kobe University.
    10. Bo, Yimin & Bao, Minglei & Ding, Yi & Hu, Yishuang, 2024. "A DNN-based reliability evaluation method for multi-state series-parallel systems considering semi-Markov process," Reliability Engineering and System Safety, Elsevier, vol. 242(C).
    11. Chehade, Abdallah & Savargaonkar, Mayuresh & Krivtsov, Vasiliy, 2022. "Conditional Gaussian mixture model for warranty claims forecasting," Reliability Engineering and System Safety, Elsevier, vol. 218(PB).
    12. Zheng, Shuwen & Wang, Chong & Zio, Enrico & Liu, Jie, 2024. "Fault detection in complex mechatronic systems by a hierarchical graph convolution attention network based on causal paths," Reliability Engineering and System Safety, Elsevier, vol. 243(C).
    13. Keiji Jindo & Jens A. Andersson & Foluke Quist-Wessel & Jackonia Onyango & Johannes W. A. Langeveld, 2023. "Gendered investment differences among smallholder farmers: evidence from a microcredit programme in western Kenya," Food Security: The Science, Sociology and Economics of Food Production and Access to Food, Springer;The International Society for Plant Pathology, vol. 15(6), pages 1489-1504, December.
    14. Zaitseva, Elena & Levashenko, Vitaly & Rabcan, Jan, 2023. "A new method for analysis of Multi-State systems based on Multi-valued decision diagram under epistemic uncertainty," Reliability Engineering and System Safety, Elsevier, vol. 229(C).
    15. Acácio Dom Luís & Rafael Benítez & María del Carmen Bas, 2025. "Bridging Crisp-Set Qualitative Comparative Analysis and Association Rule Mining: A Formal and Computational Integration," Mathematics, MDPI, vol. 13(12), pages 1-28, June.
    16. Wang, Zuyi & Takagi, Chifumi & Kim, Man-Keun & Chung, Anh, 2022. "Uncover Drivers Influencing Consumers' WTP Using Machine Learning: Case of Organic Coffee in Taiwan," 2022 Annual Meeting, July 31-August 2, Anaheim, California 322150, Agricultural and Applied Economics Association.
    17. Gursel, Ezgi & Madadi, Mahboubeh & Coble, Jamie Baalis & Agarwal, Vivek & Yadav, Vaibhav & Boring, Ronald L. & Khojandi, Anahita, 2025. "The role of AI in detecting and mitigating human errors in safety-critical industries: A review," Reliability Engineering and System Safety, Elsevier, vol. 256(C).
    18. Fallahdizcheh, Amirhossein & Wang, Chao, 2022. "Transfer learning of degradation modeling and prognosis based on multivariate functional analysis with heterogeneous sampling rates," Reliability Engineering and System Safety, Elsevier, vol. 223(C).
    19. Xie, Haipeng & Tang, Lingfeng & Zhu, Hao & Cheng, Xiaofeng & Bie, Zhaohong, 2023. "Robustness assessment and enhancement of deep reinforcement learning-enabled load restoration for distribution systems," Reliability Engineering and System Safety, Elsevier, vol. 237(C).
    20. Bai, Ruxue & Meng, Zong & Xu, Quansheng & Fan, Fengjie, 2023. "Fractional Fourier and time domain recurrence plot fusion combining convolutional neural network for bearing fault diagnosis under variable working conditions," Reliability Engineering and System Safety, Elsevier, vol. 232(C).

