Imputation Method Based on Collaborative Filtering and Clustering for the Missing Data of the Squeeze Casting Process Parameters

Cited by: 3
Authors
Deng, Jianxin [1, 2]
Ye, Zhixing [1 ]
Shan, Lubao [1 ]
You, Dongdong [3 ]
Liu, Guangming [2 ]
Affiliations
[1] Guangxi Univ, Guangxi Key Lab Mfg Syst & Adv Mfg Technol, Nanning 530003, Peoples R China
[2] Guangxi Univ, Sch Mech Engn, Nanning 530003, Peoples R China
[3] South China Univ Technol, Natl Engn Res Ctr Near Net Shape Forming Metall M, Guangzhou 510640, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Squeeze casting; Data-driven materials manufacturing; Missing data; Imputation method; Clustering collaborative filtering; Process data; MULTIPLE IMPUTATION; REGRESSION-MODELS; OPTIMIZATION; VALIDATION; SYSTEM;
DOI
10.1007/s40192-021-00248-x
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
An efficient methodology for establishing squeeze casting process parameters from past data is essential. However, designing process parameters from past data is difficult when many values are missing. Conventional missing-data approaches incur additional computational cost when applied to high-dimensional multivariable data with missing values, especially correlated material process data. Because the relationship between material composition and process parameters resembles that between users and information of interest, this paper proposes a missing-data imputation method based on a clustering-based collaborative filtering (ClubCF) algorithm. Data samples were divided into two groups: those with missing values and those without. K-means clustering based on a canopy algorithm was applied to the samples without missing values to obtain k subclasses, whose values were then used to fill the samples with missing values through user-based collaborative filtering with Pearson similarity. Squeeze casting process parameter data of aluminum alloys with missing values were used to evaluate the method, and further comparative experiments were carried out to characterize its performance and features. Two indicators, the mean absolute error and the standard deviation, were used to quantify imputation performance, which was compared with that of three conventional methods (mean interpolation, regression interpolation, and the expectation maximization algorithm). The results indicate that the proposed approach is effective and outperforms the conventional methods in processing high-dimensional correlated data.
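The pipeline outlined in the abstract (split the samples, cluster the complete ones, fill by Pearson-weighted neighbours) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the toy numbers, and the similarity-weighted average used for filling are assumptions made for illustration. The canopy step is replaced by a fixed k with the first k complete rows as initial centres.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(a)
    if n < 2:
        return 0.0
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)

def sq_dist(a, b, dims):
    """Squared Euclidean distance restricted to the given dimensions."""
    return sum((a[d] - b[d]) ** 2 for d in dims)

def kmeans(rows, k, iters=20):
    """Plain k-means. Assumption: the first k rows seed the centres,
    standing in for the canopy step named in the abstract."""
    centres = [list(r) for r in rows[:k]]
    dims = range(len(rows[0]))
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(r, centres[c], dims))
                  for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return centres, labels

def impute(data, k=2):
    """Fill None entries from the nearest cluster of complete samples."""
    complete = [r for r in data if None not in r]
    centres, labels = kmeans(complete, k)
    filled = []
    for row in data:
        if None not in row:
            filled.append(list(row))
            continue
        obs = [d for d, v in enumerate(row) if v is not None]
        # Nearest cluster, measured on the observed dimensions only.
        c = min(range(k), key=lambda j: sq_dist(row, centres[j], obs))
        members = [r for r, lab in zip(complete, labels) if lab == c]
        target = [row[d] for d in obs]
        # Assumption: negative similarities are clipped to zero.
        weights = [max(pearson(target, [m[d] for d in obs]), 0.0)
                   for m in members]
        total = sum(weights)
        new = list(row)
        for d, v in enumerate(row):
            if v is None:
                if total > 0:
                    new[d] = sum(w * m[d] for w, m in zip(weights, members)) / total
                else:
                    new[d] = centres[c][d]  # fall back to the cluster mean
        filled.append(new)
    return filled

def mean_absolute_error(true_vals, pred_vals):
    """MAE between known values and their imputed counterparts."""
    return sum(abs(t - p) for t, p in zip(true_vals, pred_vals)) / len(true_vals)

# Toy data invented for illustration; not squeeze casting measurements.
data = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 8.0, 6.0, 4.0],
    [1.2, 2.1, 3.3, 4.2],
    [10.5, 8.4, 6.1, 4.3],
    [1.1, 2.2, None, 4.1],
    [10.2, None, 6.0, 4.1],
]
filled = impute(data, k=2)
```

In practice the features would likely need scaling before clustering, since process parameters are measured in different units. The sketch skips this for brevity.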
Pages: 95-108
Page count: 14
Related papers
50 in total
  • [31] Evolving Clustering Based Data Imputation
    Gautam, Chandan
    Ravi, Vadlamani
    2014 IEEE INTERNATIONAL CONFERENCE ON CIRCUIT, POWER AND COMPUTING TECHNOLOGIES (ICCPCT-2014), 2014, : 1763 - 1769
  • [32] Imputation Method of Random Arbitrary Missing Data Based on Improved Close Degree of Grey Incidence
    Liu, Guodong
    Zhu, Jianjun
    Liu, Xiaodi
    JOURNAL OF GREY SYSTEM, 2019, 31 (02) : 74 - 97
  • [33] Optimization of squeeze casting process parameters using Taguchi analysis
    Vijian, P.
    Arunachalam, V. P.
    INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY, 2007, 33 (11-12) : 1122 - 1127
  • [34] Regression-based imputation of explanatory discrete missing data
    Hernandez-Herrera, Gilma
    Navarro, Albert
    Morina, David
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2024, 53 (09) : 4363 - 4379
  • [35] Kernel-based multi-imputation for missing data
    Zhang, Shichao
    Qin, Yongsong
    Zhu, Xiaofeng
    Zhang, Jilian
    Zhang, Chengqi
    ADVANCES IN INTELLIGENT IT: ACTIVE MEDIA TECHNOLOGY 2006, 2006, 138 : 106 - +
  • [36] Composite Imputation Method for the Multiple Linear Regression with Missing at Random Data
    Thongsri, Thidarat
    Samart, Klairung
    INTERNATIONAL JOURNAL OF MATHEMATICS AND COMPUTER SCIENCE, 2022, 17 (01) : 51 - 62
  • [37] Effects of squeeze casting parameters on solidification time based on neural network
    Wang, Rong Ji
    Tan, Wen Fang
    Zhou, Dian-Wu
    INTERNATIONAL JOURNAL OF MATERIALS & PRODUCT TECHNOLOGY, 2013, 46 (2-3) : 124 - 140
  • [38] Impact of missing data imputation methods on gene expression clustering and classification
    de Souto, Marcilio C. P.
    Jaskowiak, Pablo A.
    Costa, Ivan G.
    BMC BIOINFORMATICS, 2015, 16
  • [39] A new iterative fuzzy clustering algorithm for multiple imputation of missing data
    Nikfalazar, Sanaz
    Yeh, Chung-Hsing
    Bedingfield, Susan
    Khorshidi, Hadi A.
    2017 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2017,
  • [40] A Missing Data Imputation Approach Using Clustering and Maximum Likelihood Estimation
    Albayrak, Muammer
    Turhan, Kemal
    Kurt, Burcin
    2017 MEDICAL TECHNOLOGIES NATIONAL CONGRESS (TIPTEKNO), 2017,