Semi-GAN: An Improved GAN-Based Missing Data Imputation Method for the Semiconductor Industry

Cited by: 12
Authors
Lee, Sun-Yong [1, 2]
Connerton, Timothy Paul [1 ]
Lee, Yeon-Woo [3 ]
Kim, Daeyoung [4 ]
Kim, Donghwan [5 ]
Kim, Jin-Ho [6 ]
Affiliations
[1] Business Sch Lausanne, CH-1022 Chavannes, Switzerland
[2] Seoul Sch Integrated Sci & Technol, Seoul 03767, South Korea
[3] Bae Kim & Lee LLC, Seoul 03161, South Korea
[4] Res Inst AIdentyx, San Jose, CA 95134 USA
[5] Res Inst BISTelligence Inc, Seoul 06754, South Korea
[6] Swiss Sch Management, Dept AI & Big Data, CH-6500 Bellinzona, Switzerland
Source
IEEE ACCESS | 2022 / Vol. 10
Keywords
Generative adversarial networks; Electronics industry; Data models; Training; Support vector machines; Random forests; Neurons; Data imputation; deep learning; fault classification and detection; generative adversarial networks; machine learning; missing data; semiconductor equipment; MULTIPLE IMPUTATION; CHAINED EQUATIONS; VALUES
DOI
10.1109/ACCESS.2022.3188871
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Complete data are required for the operation, maintenance, and detection of faults in semiconductor equipment. Missing data occur frequently because of defects such as sensor, data storage, and communication faults, leading to reductions in yield, quality, and productivity. Although many attempts have been made to solve this problem in other fields, few studies have specifically addressed data imputation in the semiconductor industry. In this study, an improved generative adversarial network (GAN)-based missing data imputation for the semiconductor industry called Semi-GAN is proposed. This study introduces a machine learning approach for dealing with data imputation in the semiconductor industry. The proposed method was applied to real data and evaluated using traditional techniques. In particular, the proposed method showed excellent results compared to traditional attribution methods when all missing data ratios in the experiments were less than 20%. It was also observed to be superior when simple and repetitive patterns were omitted rather than repetitive but not simple patterns.
Pages: 72328-72338
Page count: 11
Related papers
37 in total
  • [1] Multiple imputation by chained equations: what is it and how does it work?
    Azur, Melissa J.
    Stuart, Elizabeth A.
    Frangakis, Constantine
    Leaf, Philip J.
    [J]. INTERNATIONAL JOURNAL OF METHODS IN PSYCHIATRIC RESEARCH, 2011, 20 (01) : 40 - 49
  • [2] Batista G., 2003, Experimental comparison of K-nearest neighbor and mean or mode imputation methods with the internal strategies used by C4.5 and CN2 to treat missing data, V34
  • [3] Bengio Y, 1996, ADV NEUR IN, V8, P395
  • [4] Bi J., 2005, P ADV NEUR INF PROC, P1
  • [5] Chen YJ, 2017, IEEE INT CON AUTO SC, P731, DOI 10.1109/COASE.2017.8256190
  • [6] A data mining approach for analyzing semiconductor MES and FDC data to enhance overall usage effectiveness (OUE)
    Chien, Chen-Fu
    Diaz, Alejandra Campero
    Lan, Yu-Bin
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2014, 7 : 52 - 65
  • [7] Iterative Robust Semi-Supervised Missing Data Imputation
    Fazakis, Nikos
    Kostopoulos, Georgios
    Kotsiantis, Sotiris
    Mporas, Iosif
    [J]. IEEE ACCESS, 2020, 8 : 90555 - 90569
  • [8] Self-organising map for data imputation and correction in surveys
    Fessant, F
    Midenet, S
    [J]. NEURAL COMPUTING & APPLICATIONS, 2002, 10 (04) : 300 - 310
  • [9] Pattern classification with missing data: a review
    Garcia-Laencina, Pedro J.
    Sancho-Gomez, Jose-Luis
    Figueiras-Vidal, Anibal R.
    [J]. NEURAL COMPUTING & APPLICATIONS, 2010, 19 (02) : 263 - 282
  • [10] Generative Adversarial Networks
    Goodfellow, Ian
    Pouget-Abadie, Jean
    Mirza, Mehdi
    Xu, Bing
    Warde-Farley, David
    Ozair, Sherjil
    Courville, Aaron
    Bengio, Yoshua
    [J]. COMMUNICATIONS OF THE ACM, 2020, 63 (11) : 139 - 144