Markov Blanket Feature Selection Using Representative Sets

Cited by: 22
Authors
Yu, Kui [1]
Wu, Xindong [2]
Ding, Wei [3]
Mu, Yang [3]
Wang, Hao [4]
Affiliations
[1] Univ South Australia, Sch Informat Technol & Math Sci, Adelaide, SA 5095, Australia
[2] Univ Louisiana, Sch Comp & Informat, Lafayette, LA 70504 USA
[3] Univ Massachusetts, Dept Comp Sci, Boston, MA 02125 USA
[4] Hefei Univ Technol, Dept Comp Sci, Hefei 230009, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Bayesian networks; feature selection; Markov blankets; representative sets; FEATURE SUBSET; CLASSIFICATION; ALGORITHMS; INFORMATION; BOUNDARIES; REDUNDANCY; DISCOVERY; RELEVANCE;
DOI
10.1109/TNNLS.2016.2602365
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Using Markov blankets in a Bayesian network for feature selection has received much attention in recent years. The Markov blanket of a class attribute in a Bayesian network is a unique and minimal feature subset for optimal feature selection, provided that the probability distribution of the data set can be faithfully represented by the network. If a data set violates the faithfulness condition, however, the Markov blanket of the class attribute may not be unique. To tackle this issue, we propose the new concept of representative sets and design the selection via group alpha-investing (SGAI) algorithm, which performs Markov blanket feature selection with representative sets for classification. In empirical studies on a comprehensive set of real data, SGAI outperforms state-of-the-art Markov blanket feature selectors and other well-established feature selection methods.
Pages: 2775-2788
Number of pages: 14