Increasing the effectiveness of associative classification in terms of class imbalance by using a novel pruning algorithm

Cited by: 13
Authors
Chen, Wen-Chin [1]
Hsu, Chiun-Chieh [1]
Chu, Yu-Chun [1]
Affiliations
[1] National Taiwan University of Science and Technology, Department of Information Management, Taipei, Taiwan
Keywords
Associative classification; Direct marketing; Rare events; Class imbalance; Scoring; Probabilistic classifiers; Predictive accuracy
DOI
10.1016/j.eswa.2012.05.009
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Associative classification has received considerable interest in recent years, but research has focused on building class classifiers and paid less attention to the probability classifiers used in direct marketing. This work aims to increase the prediction accuracy of associative classification under class imbalance by adapting the scoring based on associations (SBA) algorithm. SBA is modified by coupling it with the association-rule pruning strategy of the probabilistic classification based on associations (PCBA) algorithm, which adapts the classification based on associations (CBA) algorithm to the structure of a probability classifier. PCBA differs from CBA in three ways: it increases confidence through under-sampling; it sets different minimum supports (minsups) and minimum confidences (minconfs) for rules of different classes, based on each class's distribution; and it removes the pruning rules of the lowest error rate. Experimental results on benchmark datasets and real-life application datasets indicate that the proposed method outperforms both C5.0 and the original SBA, and that the number of rules required for scoring is significantly reduced. © 2012 Elsevier Ltd. All rights reserved.
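The two data-dependent adjustments named in the abstract (under-sampling and class-specific thresholds) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `undersample` and `class_thresholds` are hypothetical, the rule that scales each class's minsup by its prior follows the common multiple-minimum-support convention, and using the class prior as that class's minconf is purely an assumption made for illustration.

```python
import random
from collections import Counter


def undersample(records, label_of, seed=0):
    """Randomly shrink every class to the size of the smallest class.

    Assumption: plain random under-sampling; the paper may use another scheme.
    """
    rng = random.Random(seed)
    by_class = {}
    for record in records:
        by_class.setdefault(label_of(record), []).append(record)
    smallest = min(len(rows) for rows in by_class.values())
    balanced = []
    for rows in by_class.values():
        balanced.extend(rng.sample(rows, smallest))
    return balanced


def class_thresholds(labels, total_minsup):
    """Derive per-class minsup and minconf from the class distribution.

    Assumptions (illustrative only):
      minsup(c)  = total_minsup * P(c)   # rare classes get a lower support bar
      minconf(c) = P(c)                  # a rule must at least match the class prior
    """
    counts = Counter(labels)
    total = sum(counts.values())
    thresholds = {}
    for cls, count in counts.items():
        prior = count / total
        thresholds[cls] = {"minsup": total_minsup * prior, "minconf": prior}
    return thresholds


if __name__ == "__main__":
    # Toy imbalanced dataset: 10 positive and 90 negative records.
    data = [("a", "pos")] * 10 + [("b", "neg")] * 90
    print(class_thresholds([label for _, label in data], total_minsup=0.1))
    print(Counter(label for _, label in undersample(data, lambda r: r[1])))
```

Under these assumptions the rare class receives a proportionally lower support threshold, so rules predicting it are not discarded merely because the class itself is infrequent.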
Pages: 12841-12850
Page count: 10
Related papers
50 in total
  • [41] Novel Mathematical Model of Breast Cancer Diagnostics Using an Associative Pattern Classification
    Santiago-Montero, Raul
    Sossa, Humberto
    Gutierrez-Hernandez, David A.
    Zamudio, Victor
    Hernandez-Bautista, Ignacio
    Valadez-Godinez, Sergio
    DIAGNOSTICS, 2020, 10 (03)
  • [42] A novel multi-objective genetic algorithm approach to address class imbalance for disease diagnosis
    Jain A.
    Ratnoo S.
    Kumar D.
    International Journal of Information Technology, 2023, 15 (2) : 1151 - 1166
  • [43] Verification of Effectiveness of a Probabilistic Algorithm for Latent Structure Extraction Using an Associative Memory Model
    Wakasugi, Kensuke
    Kuwatani, Tatsu
    Nagata, Kenji
    Asoh, Hideki
    Okada, Masato
    JOURNAL OF THE PHYSICAL SOCIETY OF JAPAN, 2014, 83 (10)
  • [44] CIIR: an approach to handle class imbalance using a novel feature selection technique
    Thiyam, Bidyapati
    Dey, Shouvik
    KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (09) : 5355 - 5388
  • [45] Organic solar cells defects classification by using a new feature extraction algorithm and an EBNN with an innovative pruning algorithm
    Lo Sciuto, Grazia
    Capizzi, Giacomo
    Shikler, Rafi
    Napoli, Christian
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2021, 36 (06) : 2443 - 2464
  • [46] Solving the class imbalance problem using ensemble algorithm: application of screening for aortic dissection
    Liu, Lijue
    Wu, Xiaoyu
    Li, Shihao
    Li, Yi
    Tan, Shiyang
    Bai, Yongping
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2022, 22 (01)
  • [47] Full-class set classification using the Hungarian algorithm
    Kuncheva, Ludmila I.
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2010, 1 (1-4) : 53 - 61
  • [50] Using significant, positively associated and relatively class correlated rules for associative classification of imbalanced datasets
    Verhein, Florian
    Chawla, Sanjay
    ICDM 2007: PROCEEDINGS OF THE SEVENTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2007: 679 - 684