Effect of Irrelevant Variables on Faulty Wafer Detection in Semiconductor Manufacturing

Author

Listed:
  • Dongil Kim

    (Department of Computer Science & Engineering, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Korea)

  • Seokho Kang

    (Department of Systems Management Engineering, Sungkyunkwan University, 2066 Seobu-ro, Jangan-gu, Suwon 16419, Korea)

Abstract

Machine learning has been applied successfully to faulty wafer detection in semiconductor manufacturing. For these tasks, prediction models are built from prior data to predict the quality of future wafers as a function of their preceding process parameters and measurements. In real-world problems, it is common for a portion of the input variables to be irrelevant to the prediction of the output variable, and including many irrelevant variables degrades the performance of prediction models. Prediction models learned by different learning algorithms typically exhibit different sensitivities to irrelevant variables. Algorithms with low sensitivity are preferred as a first trial for building prediction models, whereas a variable selection procedure needs to be considered for highly sensitive algorithms. In this study, we investigate the effect of irrelevant variables on three well-known, representative learning algorithms that can be applied to both classification and regression tasks: artificial neural network, decision tree (DT), and k-nearest neighbors (k-NN). We analyze the characteristics of these learning algorithms in the presence of irrelevant variables under different model complexity settings. An empirical analysis is performed using real-world datasets collected from a semiconductor manufacturer to examine how the number of irrelevant variables affects the behavior of prediction models trained with different learning algorithms and model complexity settings. The results indicate that the prediction accuracy of k-NN degrades severely, whereas DT is the most robust of the three in the presence of many irrelevant variables. In addition, a higher model complexity of a learning algorithm leads to a higher sensitivity to irrelevant variables.
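
The contrast between k-NN and DT described in the abstract can be illustrated with a small, self-contained sketch. It does not reproduce the authors' experiment: the data are synthetic (one relevant variable plus uniform noise variables), the decision tree is reduced to a depth-1 stump, the neural network is omitted, and every function name and parameter below (`make_data`, `run`, `n_irrelevant`, the sample sizes, `k=5`) is a hypothetical choice made for illustration.

```python
import random

def make_data(n, n_irrelevant, rng):
    """Synthetic data (an assumption): x[0] decides the label, the rest is noise."""
    X, y = [], []
    for _ in range(n):
        row = [rng.uniform(-1, 1) for _ in range(1 + n_irrelevant)]
        X.append(row)
        y.append(1 if row[0] > 0 else 0)
    return X, y

def knn_predict(X_train, y_train, x, k=5):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(X_train, y_train)
    )
    votes = sum(label for _, label in dists[:k])
    return 1 if 2 * votes > k else 0

def fit_stump(X, y):
    """Depth-1 tree: choose the (feature, threshold, polarity) that
    maximizes training accuracy."""
    best = (-1.0, 0, 0.0, 1)  # (accuracy, feature, threshold, label if x > t)
    n = len(X)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            hits = sum(1 for row, label in zip(X, y)
                       if (row[j] > t) == (label == 1))
            for acc, pol in ((hits / n, 1), (1 - hits / n, 0)):
                if acc > best[0]:
                    best = (acc, j, t, pol)
    return best[1:]

def stump_predict(stump, x):
    j, t, pol = stump
    return pol if x[j] > t else 1 - pol

def accuracy(preds, y):
    return sum(p == t for p, t in zip(preds, y)) / len(y)

def run(n_irrelevant, seed=0, n_train=200, n_test=200):
    """Return (k-NN accuracy, stump accuracy) for a given number of noise variables."""
    rng = random.Random(seed)
    X_tr, y_tr = make_data(n_train, n_irrelevant, rng)
    X_te, y_te = make_data(n_test, n_irrelevant, rng)
    knn_acc = accuracy([knn_predict(X_tr, y_tr, x) for x in X_te], y_te)
    stump = fit_stump(X_tr, y_tr)
    dt_acc = accuracy([stump_predict(stump, x) for x in X_te], y_te)
    return knn_acc, dt_acc

if __name__ == "__main__":
    for m in (0, 10, 50):
        knn_acc, dt_acc = run(m)
        print(f"irrelevant={m:3d}  k-NN acc={knn_acc:.2f}  stump acc={dt_acc:.2f}")
```

In this toy setting the noise variables enter every k-NN distance with the same weight as the relevant one, so the neighborhoods become less informative as their number grows, while the stump only ever splits on the single variable that best separates the training labels. The exact figures depend on the seed and sample sizes and say nothing about the magnitudes reported in the paper.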

Suggested Citation

  • Dongil Kim & Seokho Kang, 2019. "Effect of Irrelevant Variables on Faulty Wafer Detection in Semiconductor Manufacturing," Energies, MDPI, vol. 12(13), pages 1-11, July.
  • Handle: RePEc:gam:jeners:v:12:y:2019:i:13:p:2530-:d:244639

    Download full text from publisher

    File URL: https://www.mdpi.com/1996-1073/12/13/2530/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1996-1073/12/13/2530/
    Download Restriction: no

    References listed on IDEAS

    1. Goldstein, William M. & Busemeyer, Jerome R., 1992. "The effect of "irrelevant" variables on decision making: Criterion shifts in preferential choice?," Organizational Behavior and Human Decision Processes, Elsevier, vol. 52(3), pages 425-454, August.
    2. Fomby, Thomas B., 1981. "Loss of efficiency in regression analysis due to irrelevant variables : A generalization," Economics Letters, Elsevier, vol. 7(4), pages 319-322.
    3. Wei-Yin Loh, 2014. "Fifty Years of Classification and Regression Trees," International Statistical Review, International Statistical Institute, vol. 82(3), pages 329-348, December.

    Citations

    Cited by:

    1. Hugo Siqueira & Mariana Macedo & Yara de Souza Tadano & Thiago Antonini Alves & Sergio L. Stevan & Domingos S. Oliveira & Manoel H.N. Marinho & Paulo S.G. de Mattos Neto & João F. L. de Oliveira & Ive, 2020. "Selection of Temporal Lags for Predicting Riverflow Series from Hydroelectric Plants Using Variable Selection Methods," Energies, MDPI, vol. 13(16), pages 1-35, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Miljkovic, Dragan & Gong, Jian & Lehrke, Linda, 2009. "The Effects of Trivial Attributes on Choice of Food Products," Agricultural and Resource Economics Review, Cambridge University Press, vol. 38(2), pages 142-152, October.
    2. Yan, Ran & Wang, Shuaian & Du, Yuquan, 2020. "Development of a two-stage ship fuel consumption prediction and reduction model for a dry bulk ship," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 138(C).
    3. Eduardo Rodríguez Sánchez & Eduardo Filemón Vázquez Santacruz & Humberto Cervantes Maceda, 2023. "Effort and Cost Estimation Using Decision Tree Techniques and Story Points in Agile Software Development," Mathematics, MDPI, vol. 11(6), pages 1-31, March.
    4. Olga Takacs & Janos Vincze, 2018. "The within-job gender pay gap in Hungary," CERS-IE WORKING PAPERS 1834, Institute of Economics, Centre for Economic and Regional Studies.
    5. Jiaming Mao & Jingzhi Xu, 2020. "Ensemble Learning with Statistical and Structural Models," Papers 2006.05308, arXiv.org.
    6. Jingfang Liu & Mengshi Shi & Huihong Jiang, 2022. "Detecting Suicidal Ideation in Social Media: An Ensemble Method Based on Feature Fusion," IJERPH, MDPI, vol. 19(13), pages 1-13, July.
    7. Tai, Chung-Ching & Lin, Hung-Wen & Chie, Bin-Tzong & Tung, Chen-Yuan, 2019. "Predicting the failures of prediction markets: A procedure of decision making using classification models," International Journal of Forecasting, Elsevier, vol. 35(1), pages 297-312.
    8. Emilio Carrizosa & Cristina Molero-Río & Dolores Romero Morales, 2021. "Mathematical optimization in classification and regression trees," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(1), pages 5-33, April.
    9. Evan B Brooks & John W Coulston & Kurt H Riitters & David N Wear, 2020. "Using a hybrid demand-allocation algorithm to enable distributional analysis of land use change patterns," PLOS ONE, Public Library of Science, vol. 15(10), pages 1-21, October.
    10. Nan-Ting Liu & Feng-Chang Lin & Yu-Shan Shih, 2020. "Count regression trees," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(1), pages 5-27, March.
    11. Silva, Allyson & Roodbergen, Kees Jan & Coelho, Leandro C. & Darvish, Maryam, 2022. "Estimating optimal ABC zone sizes in manual warehouses," International Journal of Production Economics, Elsevier, vol. 252(C).
    12. Zhen Li & Jie Chen & Eric Laber & Fang Liu & Richard Baumgartner, 2023. "Optimal Treatment Regimes: A Review and Empirical Comparison," International Statistical Review, International Statistical Institute, vol. 91(3), pages 427-463, December.
    13. Katelyn Battista & Karen A. Patte & Liqun Diao & Joel A. Dubin & Scott T. Leatherdale, 2022. "Using Decision Trees to Examine Environmental and Behavioural Factors Associated with Youth Anxiety, Depression, and Flourishing," IJERPH, MDPI, vol. 19(17), pages 1-16, August.
    14. Guitouni, Adel & Martel, Jean-Marc, 1998. "Tentative guidelines to help choosing an appropriate MCDA method," European Journal of Operational Research, Elsevier, vol. 109(2), pages 501-521, September.
    15. Linwei Hu & Jie Chen & Joel Vaughan & Soroush Aramideh & Hanyu Yang & Kelly Wang & Agus Sudjianto & Vijayan N. Nair, 2021. "Supervised Machine Learning Techniques: An Overview with Applications to Banking," International Statistical Review, International Statistical Institute, vol. 89(3), pages 573-604, December.
    16. James Rodway & Petr Musilek, 2017. "Harvesting-Aware Energy Management for Environmental Monitoring WSN," Energies, MDPI, vol. 10(5), pages 1-19, May.
    17. Xiaolin Yang & Yini Fan & Dawei Xia & Yukai Zou & Yuwen Deng, 2023. "Elderly Residents’ Uses of and Preferences for Community Outdoor Spaces during Heat Periods," Sustainability, MDPI, vol. 15(14), pages 1-20, July.
    18. Farkas, Sébastien & Lopez, Olivier & Thomas, Maud, 2021. "Cyber claim analysis using Generalized Pareto regression trees with applications to insurance," Insurance: Mathematics and Economics, Elsevier, vol. 98(C), pages 92-105.
    19. Emilio Aguirre & Federico García-Suárez & Gabriela Sicilia, 2021. "Eficiencia técnica en la ganadería de carne bovina pastoril. Medición y exploración de sus determinantes en Uruguay," Documentos de Trabajo (working papers) 1321, Department of Economics - dECON.
    20. Gonzalez-Vallejo, Claudia & Moran, Elizabeth, 2001. "The Evaluability Hypothesis Revisited: Joint and Separate Evaluation Preference Reversal as a Function of Attribute Importance," Organizational Behavior and Human Decision Processes, Elsevier, vol. 86(2), pages 216-233, November.
