Predicting the Performance of Ensemble Classification Using Conditional Joint Probability

Cited by: 1
Authors
Murtza, Iqbal [1, 2]
Kim, Jin-Young [3]
Adnan, Muhammad [4]
Affiliations
[1] Chonnam Natl Univ, Educ & Res Ctr IoT Convergence Intelligent City Sa, Gwangju 61186, South Korea
[2] Air Univ, Fac Comp & AI, Dept Creat Technol, Islamabad 44230, Pakistan
[3] Chonnam Natl Univ, Dept Intelligent Elect & Comp Engn, Gwangju 61186, South Korea
[4] UiT Arctic Univ Norway, Dept Technol & Safety, N-9019 Tromso, Norway
Funding
National Research Foundation, Singapore
Keywords
machine learning; probability theory; ensemble classification; cost-sensitive learning; binary classification;
DOI
10.3390/math12162586
Chinese Library Classification
O1 [Mathematics]
Discipline classification codes
0701; 070101
Abstract
In many machine learning applications, a single classifier does not deliver satisfactory performance. In such cases, an ensemble classifier is built from several weak base learners to reach the required performance. Unfortunately, ensemble construction is empirical: an ensemble is built and tried, and discarded if its performance is unsatisfactory. This paper considers the challenging analytical problem of estimating the performance of an ensemble classifier from the prediction performance of its base learners. The proposed formulation aims to estimate ensemble performance without actually building the ensemble, and it is derived from probability theory by manipulating the decision probabilities of the base learners. For this purpose, the output of a base learner (true positive, true negative, false positive, or false negative) is treated as a random variable. The effects of logical disjunction-based and majority voting-based decision combination strategies are then analyzed from the perspective of conditional joint probability. The forecasted ensemble performance is evaluated on publicly available standard datasets. The results show that the derived formulations are effective at estimating ensemble performance. In addition, the theoretical and experimental results show that the logical disjunction-based decision outperforms majority voting on imbalanced datasets and in cost-sensitive scenarios.
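To make the idea of forecasting ensemble performance from base-learner rates concrete, here is a minimal Python sketch. It is not the paper's formulation: it assumes the base learners' votes are independent given the true class, which is a simplification of the conditional joint probability treatment described in the abstract. The function names `or_rate` and `majority_rate` and the example rates are hypothetical, chosen only for illustration.

```python
from itertools import product


def or_rate(rates):
    """P(at least one learner votes positive), assuming independent votes.

    `rates` holds each learner's probability of voting positive on a fixed
    true class (TPRs for positives, FPRs for negatives).
    """
    all_negative = 1.0
    for r in rates:
        all_negative *= 1.0 - r
    return 1.0 - all_negative


def majority_rate(rates):
    """P(more than half of the learners vote positive), assuming independence."""
    n = len(rates)
    total = 0.0
    # Enumerate every vote pattern; fine for a handful of learners.
    for votes in product([0, 1], repeat=n):
        if 2 * sum(votes) > n:
            p = 1.0
            for v, r in zip(votes, rates):
                p *= r if v else 1.0 - r
            total += p
    return total


# Hypothetical per-learner rates (illustrative values, not from the paper).
tpr = [0.7, 0.6, 0.8]  # P(vote positive | sample is positive)
fpr = [0.1, 0.2, 0.1]  # P(vote positive | sample is negative)

print("OR       TPR/FPR:", or_rate(tpr), or_rate(fpr))
print("Majority TPR/FPR:", majority_rate(tpr), majority_rate(fpr))
```

Under these assumed rates, the disjunction rule recovers more positives than majority voting but also raises more false alarms, which is the kind of trade-off that matters when missed positives are costly or the positive class is rare.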
Pages: 16
Related papers
50 in total
  • [31] Ensemble classification for predicting the malignancy level of pulmonary nodules on chest computed tomography images
    Xiao, Ning
    Qiang, Yan
    Zia, Muhammad Bilal
    Wang, Sanhu
    Lian, Jianhong
    ONCOLOGY LETTERS, 2020, 20 (01) : 401 - 408
  • [32] Predicting the Impact of Construction Rework Cost Using an Ensemble Classifier
    Mostofi, Fatemeh
    Togan, Vedat
    Ayozen, Yunus Emre
    Tokdemir, Onur Behzat
    SUSTAINABILITY, 2022, 14 (22)
  • [33] Predicting Stock prices using Ensemble Learning and Sentiment Analysis
    Pasupulety, Ujjwal
    Anees, Aiman Abdullah
    Anmol, Subham
    Mohan, Biju R.
    2019 IEEE SECOND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND KNOWLEDGE ENGINEERING (AIKE), 2019 : 215 - 222
  • [34] Predicting carbonation depth of concrete using a hybrid ensemble model
    Huo, Zehui
    Wang, Ling
    Huang, Yimiao
    JOURNAL OF BUILDING ENGINEERING, 2023, 76
  • [35] Malware Classification Using Probability Scoring and Machine Learning
    Xue, Di
    Li, Jingmei
    Lv, Tu
    Wu, Weifei
    Wang, Jiaxiang
    IEEE ACCESS, 2019, 7 : 91641 - 91656
  • [36] Predicting Academic Performance Based on Students' Family Environment: Evidence for Colombia Using Classification Trees
    David Garcia-Gonzalez, Juan
    Skrita, Anastasija
    PSYCHOLOGY SOCIETY & EDUCATION, 2019, 11 (03) : 299 - 311
  • [37] RNA Family Classification Using the Conditional Random Fields Model
    Subpaiboonkit, Sitthichoke
    Thammarongtham, Chinae
    Chaijaruwanich, Jeerayut
    CHIANG MAI JOURNAL OF SCIENCE, 2012, 39 (01) : 1 - 7
  • [38] Predicting Vectorization Profitability Using Binary Classification
    Trouve, Antoine
    Cruz, Arnaldo J.
    Ben Brahim, Dhouha
    Fukuyama, Hiroki
    Murakami, Kazuaki J.
    Clarke, Hadrien
    Arai, Masaki
    Nakahira, Tadashi
    Yamanaka, Eiji
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2014, E97D (12) : 3124 - 3132
  • [39] Classification of Drivers' Workload Using Physiological Signals in Conditional Automation
    Meteier, Quentin
    Capallera, Marine
    Ruffieux, Simon
    Angelini, Leonardo
    Abou Khaled, Omar
    Mugellini, Elena
    Widmer, Marino
    Sonderegger, Andreas
    FRONTIERS IN PSYCHOLOGY, 2021, 12
  • [40] Cost-sensitive probability for weighted voting in an ensemble model for multi-class classification problems
    Rojarath, Artittayapron
    Songpan, Wararat
    APPLIED INTELLIGENCE, 2021, 51 (07) : 4908 - 4932