Imputation Method Based on Collaborative Filtering and Clustering for the Missing Data of the Squeeze Casting Process Parameters

Cited by: 3
Authors
Deng, Jianxin [1 ,2 ]
Ye, Zhixing [1 ]
Shan, Lubao [1 ]
You, Dongdong [3 ]
Liu, Guangming [2 ]
Affiliations
[1] Guangxi Univ, Guangxi Key Lab Mfg Syst & Adv Mfg Technol, Nanning 530003, Peoples R China
[2] Guangxi Univ, Sch Mech Engn, Nanning 530003, Peoples R China
[3] South China Univ Technol, Natl Engn Res Ctr Near Net Shape Forming Metall M, Guangzhou 510640, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Squeeze casting; Data-driven materials manufacturing; Missing data; Imputation method; Clustering collaborative filtering; Process data; MULTIPLE IMPUTATION; REGRESSION-MODELS; OPTIMIZATION; VALIDATION; SYSTEM;
DOI
10.1007/s40192-021-00248-x
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Establishing squeeze casting process parameters efficiently from past data is essential, but doing so is difficult when the data contain many missing values. Conventional missing-data approaches face additional computational challenges when applied to high-dimensional multivariable data, especially correlated material process data. Because the relationship between material composition and process parameters resembles that between users and information of interest, this paper proposes a missing-data imputation method based on a clustering-based collaborative filtering (ClubCF) algorithm. Data samples were divided into two groups: those with missing values and those without. K-means clustering based on a canopy algorithm was applied to the samples without missing values to obtain k subclasses, whose values were then used to fill the samples with missing values via user-based collaborative filtering with Pearson similarity. Squeeze casting process parameter data for aluminum alloys, with missing values, were used to evaluate the method, and further comparative experiments were carried out to characterize its performance and features. Two indicators, the mean absolute error and the standard deviation, were used to quantify imputation performance, which was compared with that of three conventional methods (mean interpolation, regression interpolation, and the expectation maximization algorithm). The results indicate that the proposed approach is effective and outperforms the conventional methods in processing high-dimensional correlated data.
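The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the single canopy threshold `t2`, the use of canopy centers to seed K-means, the similarity-weighted average used for filling, and the toy values are all assumptions made for the example. Columns are assumed to be already scaled to comparable ranges, and missing entries are represented as `None`.

```python
import math

def dist(a, b, dims):
    """Euclidean distance restricted to the given column indices."""
    return math.sqrt(sum((a[d] - b[d]) ** 2 for d in dims))

def canopy_centers(rows, t2):
    """Simplified canopy pass (assumed): each center removes all rows within t2."""
    dims = range(len(rows[0]))
    remaining, centers = list(rows), []
    while remaining:
        c = remaining[0]
        centers.append(list(c))
        remaining = [r for r in remaining if dist(r, c, dims) > t2]
    return centers

def assign(rows, centers):
    """Group each complete row with its nearest center."""
    dims = range(len(rows[0]))
    groups = [[] for _ in centers]
    for r in rows:
        j = min(range(len(centers)), key=lambda i: dist(r, centers[i], dims))
        groups[j].append(r)
    return groups

def kmeans(rows, centers, iters=20):
    """Plain Lloyd iterations starting from the canopy centers."""
    for _ in range(iters):
        groups = assign(rows, centers)
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else c
                   for g, c in zip(groups, centers)]
    return centers, assign(rows, centers)

def pearson(a, b, dims):
    """Pearson correlation of two samples over the observed columns."""
    n = len(dims)
    if n < 2:
        return 0.0
    ma = sum(a[d] for d in dims) / n
    mb = sum(b[d] for d in dims) / n
    cov = sum((a[d] - ma) * (b[d] - mb) for d in dims)
    va = sum((a[d] - ma) ** 2 for d in dims)
    vb = sum((b[d] - mb) ** 2 for d in dims)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def impute(sample, centers, groups):
    """Fill None entries from the nearest cluster, weighting members by similarity."""
    obs = [d for d, v in enumerate(sample) if v is not None]
    miss = [d for d, v in enumerate(sample) if v is None]
    j = min((i for i in range(len(centers)) if groups[i]),
            key=lambda i: dist(sample, centers[i], obs))
    members = groups[j]
    # Negative similarities are clipped to zero (an assumption of this sketch).
    weights = [max(pearson(sample, m, obs), 0.0) for m in members]
    total = sum(weights)
    filled = list(sample)
    for d in miss:
        if total > 0:
            filled[d] = sum(w * m[d] for w, m in zip(weights, members)) / total
        else:
            filled[d] = centers[j][d]  # fall back to the cluster mean
    return filled

def mean_absolute_error(true_vals, pred_vals):
    return sum(abs(t - p) for t, p in zip(true_vals, pred_vals)) / len(true_vals)

# Toy, made-up complete samples (not data from the paper).
complete = [
    [0.10, 0.20, 0.30, 0.40], [0.12, 0.22, 0.33, 0.41], [0.11, 0.19, 0.31, 0.42],
    [0.90, 0.80, 0.70, 0.60], [0.88, 0.82, 0.69, 0.61], [0.91, 0.79, 0.72, 0.58],
]
centers, groups = kmeans(complete, canopy_centers(complete, t2=0.5))
filled = impute([0.11, 0.21, None, 0.41], centers, groups)
print(len(centers), filled)
```

In this sketch the number of canopies fixes k, so the threshold `t2` effectively controls how many subclasses are produced; `mean_absolute_error` can then be applied to values that were deliberately hidden before imputation to compare against other filling strategies.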
Pages: 95-108
Page count: 14
Related papers
50 in total
  • [21] A MISSING DATA IMPUTATION METHOD WITH DISTANCE FUNCTION
    Jea, Kuen-Fang
    Hsu, Chin-Wei
    Tang, Li-You
    PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOL 2, 2018, : 450 - 455
  • [22] A Missing Data Imputation Method Based on Cluster and Spatial Autoregresive Model
    Yang Zhaohui
    Yu Jie
    Chen Jiangping
    EPLWW3S 2011: 2011 INTERNATIONAL CONFERENCE ON ECOLOGICAL PROTECTION OF LAKES-WETLANDS-WATERSHED AND APPLICATION OF 3S TECHNOLOGY, VOL 2, 2011, : 538 - 541
  • [23] A Genetic Programming-Based Imputation Method for Classification with Missing Data
    Cao Truong Tran
    Zhang, Mengjie
    Andreae, Peter
    GENETIC PROGRAMMING, EUROGP 2016, 2016, 9594 : 149 - 163
  • [24] Missing data analyses: a hybrid multiple imputation algorithm using Gray System Theory and entropy based on clustering
    Jing Tian
    Bing Yu
    Dan Yu
    Shilong Ma
    Applied Intelligence, 2014, 40 : 376 - 388
  • [25] Study on missing data imputation and modeling for the leaching process
    He, Dakuo
    Wang, Zhengsong
    Yang, Le
    Dai, Wanwan
    CHEMICAL ENGINEERING RESEARCH & DESIGN, 2017, 124 : 1 - 19
  • [26] Process parameters design of squeeze casting through SMR ensemble model and ACO
    Deng, Jianxin
    Wang, Ling
    Liu, Gang
    You, Dongdong
    Wu, Xiusong
    Liang, Jiawei
    INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY, 2023, 130 (5-6) : 2687 - 2704
  • [27] Multiple imputation for missing data in a longitudinal cohort study: a tutorial based on a detailed case study involving imputation of missing outcome data
    Lee, Katherine J.
    Roberts, Gehan
    Doyle, Lex W.
    Anderson, Peter J.
    Carlin, John B.
    INTERNATIONAL JOURNAL OF SOCIAL RESEARCH METHODOLOGY, 2016, 19 (05) : 575 - 591
  • [28] Missing data analyses: a hybrid multiple imputation algorithm using Gray System Theory and entropy based on clustering
    Tian, Jing
    Yu, Bing
    Yu, Dan
    Ma, Shilong
    APPLIED INTELLIGENCE, 2014, 40 (02) : 376 - 388
  • [29] Clustering with missing and left-censored data: A simulation study comparing multiple-imputation-based procedures
    Faucheux, Lilith
    Resche-Rigon, Matthieu
    Curis, Emmanuel
    Soumelis, Vassili
    Chevret, Sylvie
    BIOMETRICAL JOURNAL, 2021, 63 (02) : 372 - 393
  • [30] Optimization of squeeze casting process parameters using Taguchi analysis
    P. Vijian
    V. P. Arunachalam
    The International Journal of Advanced Manufacturing Technology, 2007, 33 : 1122 - 1127