Imputation Method Based on Collaborative Filtering and Clustering for the Missing Data of the Squeeze Casting Process Parameters

Cited by: 3

Authors:

Deng, Jianxin [1,2]

Ye, Zhixing [1]

Shan, Lubao [1]

You, Dongdong [3]

Liu, Guangming [2]

Affiliations:

[1] Guangxi Univ, Guangxi Key Lab Mfg Syst & Adv Mfg Technol, Nanning 530003, Peoples R China

[2] Guangxi Univ, Sch Mech Engn, Nanning 530003, Peoples R China

[3] South China Univ Technol, Natl Engn Res Ctr Near Net Shape Forming Metall M, Guangzhou 510640, Peoples R China

Source:

INTEGRATING MATERIALS AND MANUFACTURING INNOVATION | 2022 / Vol. 11 / Issue 01

Funding:

National Natural Science Foundation of China;

Keywords:

Squeeze casting; Data-driven materials manufacturing; Missing data; Imputation method; Clustering collaborative filtering; Process data; MULTIPLE IMPUTATION; REGRESSION-MODELS; OPTIMIZATION; VALIDATION; SYSTEM;

DOI:

10.1007/s40192-021-00248-x

Chinese Library Classification (CLC) number:

T [Industrial Technology];

Discipline classification code:

08;

Abstract:

The development of a highly efficient methodology for establishing squeeze casting process parameters from past data is essential. However, designing squeeze casting process parameters based on past data is difficult when there are many missing values. Conventional missing data approaches face additional computational challenges when applied to high-dimensional multivariable missing data, especially correlated material process data. As the relationship between material composition and process parameters resembles that between users and information of interest, this paper proposes a method for missing data imputation based on a clustering-based collaborative filtering (ClubCF) algorithm to address this challenge. Data samples with and without missing values were divided into two groups. K-means clustering based on a canopy algorithm was applied to the data samples without missing values to obtain k subclass data, whose values were then selected to fill data samples with missing values via a collaborative filtering theory based on Pearson similarity user filling. Squeeze casting process parameter data for aluminum alloys with missing values were used to evaluate the method, and additional comparative experiments were carried out to understand its performance and features. Two indicators, the mean absolute error and the standard deviation, were used to quantify the imputation performance, which was compared with that of three conventional methods (mean interpolation, regression interpolation, and the expectation maximization algorithm). The results indicate that the proposed approach is effective and outperforms conventional methods in processing high-dimensional correlated data.
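The pipeline described in the abstract (split samples by completeness, cluster the complete ones, then fill each incomplete sample from similar members of its nearest cluster) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the canopy step is replaced by plain k-means with a fixed k seeded from the first k complete rows, the fill rule is a simple Pearson-weighted average, features are assumed to be on comparable scales, and all function names (`pearson`, `kmeans`, `impute`) and the toy numbers are hypothetical.

```python
from math import sqrt


def sq_dist(a, b, dims):
    """Squared Euclidean distance restricted to the given dimensions."""
    return sum((a[i] - b[i]) ** 2 for i in dims)


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(a)
    if n < 2:
        return 0.0
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def kmeans(rows, k, iters=20):
    """Plain k-means. Assumption: the first k rows seed the centroids,
    standing in for the canopy-based initialisation named in the abstract."""
    dims = range(len(rows[0]))
    centroids = [list(r) for r in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(r, centroids[c], dims))
                  for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def impute(rows, k=2):
    """Fill None entries using Pearson-weighted averages of cluster peers."""
    complete = [r for r in rows if None not in r]
    centroids, labels = kmeans(complete, k)
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        obs = [i for i, v in enumerate(row) if v is not None]
        # Nearest cluster, judged on the observed dimensions only.
        best = min(range(k), key=lambda c: sq_dist(row, centroids[c], obs))
        peers = [r for r, lab in zip(complete, labels) if lab == best]
        target = [row[i] for i in obs]
        # Negative correlations are clipped to zero weight.
        weights = [max(pearson(target, [p[i] for i in obs]), 0.0) for p in peers]
        total = sum(weights)
        new_row = list(row)
        for j, v in enumerate(row):
            if v is None:
                if total > 0:
                    new_row[j] = sum(w * p[j] for w, p in zip(weights, peers)) / total
                else:
                    new_row[j] = centroids[best][j]  # fall back to cluster mean
        filled.append(new_row)
    return filled


if __name__ == "__main__":
    # Made-up toy data: four complete samples and one with a missing value.
    data = [
        [1.0, 2.0, 3.0, 4.0],
        [10.0, 20.0, 30.0, 40.0],
        [1.2, 2.1, 3.1, 4.4],
        [11.0, 21.0, 31.0, 41.0],
        [1.1, 2.0, 3.1, None],
    ]
    print(impute(data, k=2)[-1])
```

In the toy run, the incomplete sample falls into the cluster of small-valued rows, so its missing entry is filled with a value between 4.0 and 4.4. A fuller version would normalise each feature before clustering and would score the result with the mean absolute error on artificially masked entries, as the abstract describes.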


Pages: 95-108

Page count: 14
