Class imbalance learning using fuzzy ART and intuitionistic fuzzy twin support vector machines

Cited by: 39
Authors
Rezvani, Salim [1]
Wang, Xizhao [1]
Affiliations
[1] Shenzhen Univ, Big Data Inst, Coll Comp Sci & Software Engn, Guangdong Key Lab Intelligent Informat Proc, Shenzhen 518060, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Class imbalance learning; Coordinate descent; Intuitionistic fuzzy number; Fuzzy ART; Twin support vector machine; CLASSIFICATION; PERFORMANCE; ALGORITHM; SMOTE;
DOI
10.1016/j.ins.2021.07.010
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Classification of imbalanced datasets is one of the main problems for machine learning techniques. The support vector machine (SVM) is biased toward the majority class samples, and the minority class samples may incorrectly be treated as noise. As a result, SVM has poor predictive accuracy on imbalanced datasets and generates inaccurate classification models. Existing class imbalance learning (CIL) techniques can make SVM less sensitive to class imbalance, but these methods suffer from issues related to noise and outliers. Moreover, despite its solid theoretical basis and good classification performance, SVM is not appropriate for the classification of large-scale datasets because its training complexity is closely related to the dataset size. To overcome these difficulties, class imbalance learning using fuzzy adaptive resonance theory (Fuzzy ART) and intuitionistic fuzzy twin SVM (CIL-FART-IFTSVM) is proposed; it can address the class imbalance issue in the presence of noise and outliers and on large-scale datasets. In this method, the distribution of the datasets is first modified using Fuzzy ART as a clustering method to overcome the imbalance problem. Then, after data reduction, IFTSVM is used to find non-parallel hyperplanes in the generated data points. Finally, a coordinate descent scheme with active-set shrinking is applied to reduce the computational complexity. Forty-five imbalanced datasets are considered to validate the performance of the proposed CIL-FART-IFTSVM method. The Friedman test and the bootstrap technique with 95% confidence intervals are applied to quantify the results statistically. The experimental results indicate that the proposed method performs better than the other methods compared, and that its training time is significantly shorter than that of other classifiers on large-scale datasets. © 2021 Elsevier Inc. All rights reserved.
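The clustering step named in the abstract can be illustrated with a minimal sketch of the standard Fuzzy ART rules (complement coding, choice function, vigilance test, weight update). This is not the authors' implementation: the function name `fuzzy_art`, the parameter values for `rho`, `alpha`, and `beta`, and the toy data are assumptions chosen for illustration. The abstract also does not say how the resulting clusters are turned into the reduced training set, so that part is left out.

```python
def fuzzy_art(points, rho=0.75, alpha=0.001, beta=1.0):
    """Cluster points whose features are already scaled to [0, 1].

    rho   : vigilance threshold (assumed value; higher means more clusters)
    alpha : small choice parameter (assumed value)
    beta  : learning rate; 1.0 is the usual "fast learning" setting
    Returns (labels, weights), one weight vector per category.
    """
    weights = []  # category weight vectors, in complement-coded space
    labels = []   # category index assigned to each input point
    for p in points:
        # Complement coding: append 1 - v for every feature v.
        x = list(p) + [1.0 - v for v in p]
        norm_x = sum(x)

        # Score every existing category with the choice function
        # T_j = |x AND w_j| / (alpha + |w_j|), where AND is the
        # element-wise minimum and |.| is the sum of components.
        scored = []
        for j, w in enumerate(weights):
            overlap = sum(min(a, b) for a, b in zip(x, w))
            scored.append((overlap / (alpha + sum(w)), j, overlap))
        scored.sort(key=lambda t: -t[0])

        # Try categories from best to worst until one passes vigilance.
        for _, j, overlap in scored:
            if overlap / norm_x >= rho:
                w = weights[j]
                weights[j] = [beta * min(a, b) + (1 - beta) * b
                              for a, b in zip(x, w)]
                labels.append(j)
                break
        else:
            # No category resonated: commit a new one at this input.
            weights.append(x)
            labels.append(len(weights) - 1)
    return labels, weights


# Toy data (hypothetical): two well-separated groups in the unit square.
X = [[0.10, 0.10], [0.12, 0.08], [0.90, 0.90], [0.88, 0.92]]
labels, weights = fuzzy_art(X)
print(labels)        # [0, 0, 1, 1]
print(len(weights))  # 2
```

In a class-imbalance setting, one plausible use of such a clustering (again an assumption, not a detail taken from the abstract) is to summarize the majority class by a smaller set of cluster representatives before training the classifier.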
Pages: 659-682
Page count: 24