Graph-Based Semi-supervised Feature Selection with Application to Automatic Spam Image Identification

Times cited: 0
Authors
Cheng, Hongrong [1]
Deng, Wei [1]
Fu, Chong [1]
Wang, Yong [1]
Qin, Zhiguang [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China
Source
COMPUTER SCIENCE FOR ENVIRONMENTAL ENGINEERING AND ECOINFORMATICS, PT 2 | 2011 / Vol. 159
Keywords
Semi-supervised Feature Selection; Laplacian Score; Conditional Mutual Information; Spam Image Identification;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
In this paper, we propose a new spectral semi-supervised feature selection criterion called the s-Laplacian score. It identifies discriminative features by measuring their capability to preserve both local and global geometrical structure. Because spectral feature selection cannot handle redundant features, we define the Classification Information Gain degree (CIG) to measure feature redundancy. Based on the s-Laplacian score and CIG, we propose a graph-based semi-supervised feature selection algorithm (GSFS). Experimental results on a real-world image dataset for the automatic spam image identification problem show that GSFS can make good use of a small number of labeled samples and a large amount of unlabeled data to select discriminative features.
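The abstract does not state the s-Laplacian formula, so the following is only a background sketch of the classic unsupervised Laplacian score (which reference [6] appears to introduce), not the authors' semi-supervised criterion or the GSFS algorithm. The function name `laplacian_scores`, the parameters `k` and `t`, and the synthetic demo data are assumptions made for illustration. A lower score means the feature varies little between graph neighbours relative to its overall variance.

```python
import numpy as np

def laplacian_scores(X, k=5, t=1.0):
    """Classic unsupervised Laplacian score per feature (lower is better).

    Illustrative sketch only; not the s-Laplacian score from the paper.
    X: (n_samples, n_features) array, k: neighbours per sample (k < n_samples),
    t: heat-kernel width (both k and t are assumed hyperparameters).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # k-nearest-neighbour graph, excluding self-loops, then symmetrised.
    d2_noself = d2.copy()
    np.fill_diagonal(d2_noself, np.inf)
    idx = np.argsort(d2_noself, axis=1)[:, :k]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.arange(n)[:, None], idx] = True
    mask |= mask.T
    # Heat-kernel edge weights, degree vector, and graph Laplacian L = D - S.
    S = np.where(mask, np.exp(-d2 / t), 0.0)
    deg = S.sum(axis=1)
    L = np.diag(deg) - S
    # Remove the degree-weighted mean from every feature column.
    F = X - (deg @ X) / deg.sum()
    num = np.einsum("ij,ij->j", F, L @ F)               # f~^T L f~
    den = np.einsum("ij,ij->j", F, deg[:, None] * F)    # f~^T D f~
    # Features with zero weighted variance get an infinite (worst) score.
    return np.divide(num, den, out=np.full_like(num, np.inf), where=den > 0)

# Synthetic demo (assumed data): feature 0 separates two clusters,
# features 1 and 2 are pure noise.
rng = np.random.default_rng(0)
cluster_centres = np.repeat([0.0, 10.0], 20)
X_demo = np.column_stack([
    cluster_centres + rng.normal(0.0, 0.05, 40),
    rng.normal(0.0, 1.0, 40),
    rng.normal(0.0, 1.0, 40),
])
scores = laplacian_scores(X_demo, k=5, t=1.0)
ranking = np.argsort(scores)  # feature indices, best (lowest score) first
```

In this toy setup the cluster-separating feature should rank first. A semi-supervised variant would additionally use the available labels when building the graph, and a redundancy measure such as CIG would be applied on top; neither is reproduced here because the abstract gives no definitions.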
Pages: 259-264
Page count: 6
Related papers
11 items in total
[1] [Anonymous], 2005, 1530 U WISC
[2] [Anonymous], 2000, Pattern Classification
[3] Conditional Mutual Information-Based Feature Selection Analyzing for Synergy and Redundancy [J]. Cheng, Hongrong; Qin, Zhiguang; Feng, Chaosheng; Wang, Yong; Li, Fagen. ETRI JOURNAL, 2011, 33 (02): 210-218
[4] Dredze M., 2007, 4 C EM ANT CAL
[5] Relevant and Redundant Feature Analysis with Ensemble Classification [J]. Duangsoithong, Rakkrit; Windeatt, Terry. ICAPR 2009: SEVENTH INTERNATIONAL CONFERENCE ON ADVANCES IN PATTERN RECOGNITION, PROCEEDINGS, 2009: 247-250
[6] He X., 2005, P 18 INT C NEURAL IN, V18, P507
[7] Hongrong Cheng, 2008, 2008 IEEE Conference on Cybernetics and Intelligent Systems, P1017, DOI 10.1109/ICCIS.2008.4670821
[8] Clustering-based Feature Selection in Semi-supervised Problems [J]. Quinzan, Ianisse; Sotoca, Jose M.; Pla, Filiberto. 2009 9TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS, 2009: 535-540
[9] Witten I. H., 2005, DATA MINING, V2, P403
[10] Yeung DS, 2009, PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-6, P399, DOI 10.1109/ICMLC.2009.5212468