Collaboration graph for feature set partitioning in data classification

Cited by: 8
Authors
Taheri, Khalil [1]
Moradi, Hadi [1, 2]
Tavassolipour, Mostafa [1]
Affiliations
[1] Univ Tehran, Coll Engn, Sch Elect & Comp Engn, Machine Intelligence & Robot Dept, Tehran 111554563, Iran
[2] SKKU, Intelligent Syst Res Inst ISRI, Suwon 16419, South Korea
Keywords
Ensemble Classification; Features Collaboration Graph; Community Detection; AdaBoost Algorithm; FEATURE-SELECTION; MICROARRAY DATA; CANCER; TUMOR; CURSE; PATTERNS;
DOI
10.1016/j.eswa.2022.118988
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The curse of dimensionality in data classification remains an open problem. One approach is to partition the feature set into several subsets, perform classification on each subset, and report an ensemble of these classifications as the final result. How best to partition the feature set, however, is still an active research question. This paper proposes a framework in which a collaboration measure is first defined and computed for every pair of features. A collaboration graph is then built, with features as nodes and the measured collaborations as edge weights. Next, a community detection method is applied to find the communities of this graph. Each community is treated as a feature subset, and a base classifier is trained for each subset on the corresponding training data. Finally, the ensemble classifier is formed by combining the base classifiers according to AdaBoost aggregation. Simulation results on real and synthetic datasets indicate that the proposed approach considerably increases classification accuracy compared with previous methods.
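The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the abstract does not define the collaboration measure, so absolute Pearson correlation is used as a stand-in; community detection is replaced by connected components of a thresholded graph; the base classifier is a nearest-centroid rule; and only the AdaBoost classifier-weight formula is used for aggregation, without sample reweighting. All function names, the `threshold` parameter, and the toy data are hypothetical.

```python
import math
from itertools import combinations

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def collaboration_graph(X):
    """Edge weights between feature pairs. ASSUMPTION: |Pearson r| stands in
    for the paper's collaboration measure, which the abstract does not define."""
    d = len(X[0])
    cols = [[row[j] for row in X] for j in range(d)]
    return {(i, j): abs(pearson(cols[i], cols[j]))
            for i, j in combinations(range(d), 2)}

def communities(weights, d, threshold=0.5):
    """ASSUMPTION: connected components of edges with weight >= threshold,
    a simple stand-in for a real community detection method."""
    parent = list(range(d))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i
    for (i, j), w in weights.items():
        if w >= threshold:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(d):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

def fit_centroids(X, y, feats):
    """Nearest-centroid base classifier restricted to one feature subset."""
    cents = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(r[f] for r in rows) / len(rows) for f in feats]
    return cents

def predict_one(cents, feats, x):
    return min(cents, key=lambda c: sum((x[f] - m) ** 2
                                        for f, m in zip(feats, cents[c])))

def train_ensemble(X, y, threshold=0.5):
    subsets = communities(collaboration_graph(X), len(X[0]), threshold)
    ensemble = []
    for feats in subsets:
        cents = fit_centroids(X, y, feats)
        err = sum(predict_one(cents, feats, x) != t for x, t in zip(X, y)) / len(y)
        err = min(max(err, 1e-6), 1 - 1e-6)      # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)  # AdaBoost-style classifier weight
        ensemble.append((feats, cents, alpha))
    return ensemble

def predict(ensemble, x):
    """Weighted vote of the base classifiers."""
    votes = {}
    for feats, cents, alpha in ensemble:
        label = predict_one(cents, feats, x)
        votes[label] = votes.get(label, 0.0) + alpha
    return max(votes, key=votes.get)

# Toy data (hypothetical): features 0-1 move together, as do features 2-3.
X = [[0.0, 0.1, 5.0, 5.2], [0.2, 0.3, 1.0, 1.1], [0.1, 0.0, 3.0, 2.9],
     [1.0, 1.1, 4.0, 4.1], [1.2, 1.0, 2.0, 2.2], [0.9, 1.2, 6.0, 5.9]]
y = [0, 0, 0, 1, 1, 1]
ensemble = train_ensemble(X, y)
```

On this toy data the thresholded graph splits the four features into two subsets, and the subset whose base classifier has the lower training error receives the larger vote weight. A real experiment would swap in the paper's collaboration measure and a proper community detection algorithm.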
Pages: 10