Accurate multi-criteria decision making methodology for recommending machine learning algorithm

Cited by: 109
Authors
Ali, Rahman [1, 2]
Lee, Sungyoung [1 ]
Chung, Tae Choong [1 ]
Affiliations
[1] Kyung Hee Univ Global Campus, Dept Comp Engn, 1732,Deogyeong Daero, Yongin 17104, Gyeonggi Do, South Korea
[2] Commerce Univ Peshawar, Quaid E Azam Coll, Peshawar 25120, Khyber Pakhtunk, Pakistan
Funding
National Research Foundation of Singapore;
Keywords
Multi-criteria decision making; Algorithm recommendation; Algorithm selection; Classification algorithms; Classifiers recommendation; TOPSIS; Ranking classifiers; PERFORMANCE ANALYSIS; RANK; SELECTION; TIME; MCDM; AHP;
DOI
10.1016/j.eswa.2016.11.034
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Objective: Manually evaluating machine learning algorithms and selecting a suitable classifier from the list of available candidates is a highly time-consuming and challenging task. If the selection is not done carefully and accurately, the resulting classification model will not deliver the expected performance. In this study, we present an accurate multi-criteria decision making methodology (AMD) that empirically evaluates and ranks classifiers and allows end users or experts to choose the top-ranked classifier with which to learn and build classification models for their applications.

Methods and material: Existing methodologies for classifier performance analysis and recommendation lack (a) an appropriate method for selecting suitable evaluation criteria, (b) a relative consistent weighting mechanism, (c) fitness assessment of the classifiers' performance, and (d) satisfaction of various constraints during the analysis process. To assist machine learning practitioners in selecting suitable classifier(s), the AMD methodology is proposed. It comprises an expert group-based criteria selection method; a relative consistent weighting scheme; a new ranking method, called optimum performance ranking criteria, based on multiple evaluation metrics, statistical significance, and fitness assessment functions; and implicit and explicit constraint satisfaction at the time of analysis. To rank classifier performance, the proposed ranking method integrates the Wgt.Avg.F-score, CPUTimeTesting, CPUTimeTraining, and Consistency measures using the technique for order preference by similarity to ideal solution (TOPSIS). The final relative closeness scores produced by TOPSIS are ranked, and practitioners select the best-performing (top-ranked) classifier for the problem at hand.

Findings: In extensive experiments on 15 publicly available UCI and OpenML datasets using 35 classification algorithms from heterogeneous families of classifiers, the AMD method achieves an average Spearman's rank correlation coefficient of 0.98, compared with correlation coefficients of 0.83 and 0.045 for the state-of-the-art ranking methods performance of algorithms (PAlg) and adjusted ratio of ratios (ARR).

Conclusion and implication: The evaluation, the empirical analysis of the results, and the comparison with state-of-the-art methods demonstrate the feasibility of the AMD methodology, especially for selecting and weighting the right evaluation criteria and for accurately ranking and selecting the optimum-performance classifier(s) for the user's application data. AMD reduces experts' time and effort and improves system performance through the use of a suitable classifier recommended by the methodology. © 2016 Elsevier Ltd. All rights reserved.
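The TOPSIS ranking step and the Spearman comparison mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The classifier names, the metric values, and the equal weights are invented for illustration, and the paper's own criteria selection, weighting scheme, significance tests, and constraint handling are not reproduced. The `topsis` and `spearman` helpers are hypothetical names; `spearman` uses the standard formula for rankings without ties.

```python
import math


def topsis(matrix, weights, benefit):
    """Return the TOPSIS relative closeness score of each alternative.

    matrix  : rows are alternatives (classifiers), columns are criteria
    weights : one weight per criterion
    benefit : True where larger values are better, False where smaller are better
    """
    n_crit = len(weights)
    # Step 1: vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    # Step 2: ideal and anti-ideal points, criterion by criterion.
    columns = list(zip(*weighted))
    ideal = [max(c) if b else min(c) for c, b in zip(columns, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(columns, benefit)]
    # Step 3: Euclidean distances to both points and relative closeness.
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        total = d_pos + d_neg
        scores.append(d_neg / total if total else 0.0)
    return scores


def spearman(rank_a, rank_b):
    """Spearman's rank correlation for two rankings without ties."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Illustrative values only (assumption): columns are
# Wgt.Avg.F-score, CPUTimeTesting, CPUTimeTraining, Consistency.
names = ["clf_A", "clf_B", "clf_C"]
matrix = [
    [0.95, 0.1, 1.0, 0.9],
    [0.90, 0.2, 2.0, 0.8],
    [0.80, 0.5, 5.0, 0.7],
]
weights = [0.25, 0.25, 0.25, 0.25]    # equal weights, an assumption
benefit = [True, False, False, True]  # CPU times are cost criteria

scores = topsis(matrix, weights, benefit)
# Higher relative closeness means closer to the ideal solution.
ranking = [name for _, name in sorted(zip(scores, names), reverse=True)]
```

In this toy data `clf_A` is best on every criterion, so it coincides with the ideal point and is ranked first. With real measurements the ordering depends on the weights, which is why the weighting scheme matters.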
Pages: 257-278
Page count: 22
Related papers
70 in total
  • [1] AHA DW, 1992, MACHINE LEARNING /, P1
  • [2] Alexandros K., 2001, International Journal on Artificial Intelligence Tools (Architectures, Languages, Algorithms), V10, P525, DOI 10.1142/S0218213001000647
  • [3] On learning algorithm selection for classification
    Ali, S
    Smith, KA
    [J]. APPLIED SOFT COMPUTING, 2006, 6 (02) : 119 - 138
  • [4] A meta-learning approach to automatic kernel selection for support vector machines
    Ali, Shawkat
    Smith-Miles, Kate A.
    [J]. NEUROCOMPUTING, 2006, 70 (1-3) : 173 - 186
  • [5] Measure-based classifier performance evaluation
    Andersson, A
    Davidsson, P
    Lindén, J
    [J]. PATTERN RECOGNITION LETTERS, 1999, 20 (11-13) : 1165 - 1173
  • [6] [Anonymous], MULTIPLE CRITERIA DE
  • [7] [Anonymous], P 24 ANN WORKSH SWED
  • [8] Bache K., 2013, UCI Machine Learning Repository
  • [9] Berrer H, 2000, P PKDD 2000 WORKSH D
  • [10] Bouckaert RR, 2010, J MACH LEARN RES, V11, P2533