Imbalanced classification of manufacturing quality conditions using cost-sensitive decision tree ensembles

Cited by: 43
Authors
Kim, Aekyung [1]
Oh, Kyuhyup [1]
Jung, Jae-Yoon [1]
Kim, Bohyun [2]
Affiliations
[1] Kyung Hee Univ, Dept Ind & Management Syst Engn, Yongin, South Korea
[2] Korea Inst Ind Technol, IT Converged Proc R&D Grp, Ansan, South Korea
Keywords
Imbalanced classification; manufacturing quality condition classification; decision tree ensemble; cost-sensitive ensemble classification; die-casting quality analysis; DIE-CASTING PROCESS; ARTIFICIAL NEURAL-NETWORK; PROCESS PARAMETERS; SURFACE-ROUGHNESS; GENETIC ALGORITHM; PREDICTION; OPTIMIZATION; SYSTEM; DEFECT; MACHINE
DOI
10.1080/0951192X.2017.1407447
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Data-driven quality control techniques are being actively developed for smart factories, and quality prediction during manufacturing is a good example of how big data analytics can influence advanced manufacturing environments. This paper addresses the problem of classifying manufacturing process conditions as producing normal or defective products, according to defect type. Such quality analysis data sets are generally imbalanced because the defect rate is quite low in practice. To solve this imbalanced classification problem, a cost-sensitive decision tree ensemble algorithm is adopted that boosts the small number of defective cases and assigns a higher cost to misclassifying a defective product than to misclassifying a normal one. C4.5 decision trees are used as base classifiers, and three cost-sensitive ensembles, AdaC1, AdaC2 and AdaC3, are evaluated. Several types of defect conditions in a real-world die-casting data set were predicted with the proposed methods. In these experiments, the cost-sensitive ensembles classified the imbalanced data and detected the defect conditions more precisely and accurately than 19 algorithms from other classification categories, including classic classifiers and ensembles, cost-sensitive single classifiers and sampling-based ensembles. In particular, the AdaC2-based method outperformed the other classification algorithms in most cases on performance measures such as F-measure, G-mean and AUC for the die-casting quality condition classification problem.
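The abstract does not spell out the boosting update or the evaluation formulas, so the following is only a minimal sketch under stated assumptions. It shows one reweighting round in the AdaC2 style as it is commonly formulated in the cost-sensitive boosting literature (the cost multiplies the sample weight outside the exponent), together with the F-measure and G-mean for a binary defect/normal problem. The function names, the label encoding (+1 defective, -1 normal) and the cost values are hypothetical and are not taken from the paper.

```python
import math

def adac2_round(weights, costs, y_true, y_pred):
    """One AdaC2-style round (assumed formulation, not the paper's code).

    Labels are +1 (defective) or -1 (normal); costs[i] is the
    misclassification cost attached to sample i.
    """
    rows = list(zip(weights, costs, y_true, y_pred))
    correct = sum(c * w for w, c, y, h in rows if y == h)
    wrong = sum(c * w for w, c, y, h in rows if y != h)
    if correct == 0 or wrong == 0:
        raise ValueError("alpha is undefined when all predictions agree or all disagree")
    # Classifier weight: cost-weighted mass of correct vs. wrong samples.
    alpha = 0.5 * math.log(correct / wrong)
    # Cost scales the weight outside the exponent, then renormalise.
    raw = [c * w * math.exp(-alpha * y * h) for w, c, y, h in rows]
    total = sum(raw)
    return alpha, [r / total for r in raw]

def f_measure_and_g_mean(y_true, y_pred, positive=1):
    """F-measure (F1) and G-mean, treating `positive` as the defect class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, h in pairs if y == positive and h == positive)
    fn = sum(1 for y, h in pairs if y == positive and h != positive)
    fp = sum(1 for y, h in pairs if y != positive and h == positive)
    tn = sum(1 for y, h in pairs if y != positive and h != positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return f1, math.sqrt(recall * specificity)

# Toy data: two defective and four normal samples (illustrative only).
y_true = [1, 1, -1, -1, -1, -1]
y_pred = [1, -1, -1, -1, -1, 1]
weights = [1 / 6] * 6
costs = [1.0 if y == 1 else 0.5 for y in y_true]  # hypothetical cost setting

alpha, new_weights = adac2_round(weights, costs, y_true, y_pred)
f1, g_mean = f_measure_and_g_mean(y_true, y_pred)
```

In this toy round, the missed defective sample ends up with the largest weight, which is the intended effect of attaching a higher cost to the minority class. A full ensemble would repeat the round with a base learner such as a decision tree trained on the updated weights.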
Pages: 701-717
Number of pages: 17