Semi-GAN: An Improved GAN-Based Missing Data Imputation Method for the Semiconductor Industry

Cited by: 14

Authors:

Lee, Sun-Yong [1,2]

Connerton, Timothy Paul [1]

Lee, Yeon-Woo [3]

Kim, Daeyoung [4]

Kim, Donghwan [5]

Kim, Jin-Ho [6]

Affiliations:

[1] Business Sch Lausanne, CH-1022 Chavannes, Switzerland

[2] Seoul Sch Integrated Sci & Technol, Seoul 03767, South Korea

[3] Bae Kim & Lee LLC, Seoul 03161, South Korea

[4] Res Inst AIdentyx, San Jose, CA 95134 USA

[5] Res Inst BISTelligence Inc, Seoul 06754, South Korea

[6] Swiss Sch Management, Dept AI & Big Data, CH-6500 Bellinzona, Switzerland

Source:

IEEE ACCESS | 2022 / Vol. 10

Keywords:

Generative adversarial networks; Electronics industry; Data models; Training; Support vector machines; Random forests; Neurons; Data imputation; deep learning; fault classification and detection; generative adversarial networks; machine learning; missing data; semiconductor equipment; MULTIPLE IMPUTATION; CHAINED EQUATIONS; VALUES;

DOI:

10.1109/ACCESS.2022.3188871

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Complete data are required for the operation, maintenance, and detection of faults in semiconductor equipment. Missing data occur frequently because of defects such as sensor, data storage, and communication faults, leading to reductions in yield, quality, and productivity. Although many attempts have been made to solve this problem in other fields, few studies have specifically addressed data imputation in the semiconductor industry. In this study, an improved generative adversarial network (GAN)-based missing data imputation method for the semiconductor industry, called Semi-GAN, is proposed. This study introduces a machine learning approach for dealing with data imputation in the semiconductor industry. The proposed method was applied to real data and evaluated using traditional techniques. In particular, the proposed method showed excellent results compared to traditional imputation methods when all missing data ratios in the experiments were less than 20%. It was also observed to be superior when simple and repetitive patterns were omitted rather than repetitive but not simple patterns.

Pages: 72328-72338

Page count: 11
