Constraint scores for semi-supervised feature selection: A comparative study

Cited by: 56
Authors
Kalakech, Mariam [1 ,2 ]
Biela, Philippe [1 ,2 ]
Macaire, Ludovic [1 ]
Hamad, Denis [3 ]
Affiliations
[1] Univ Lille 1, LAGIS FRE CNRS 3303, F-59655 Villeneuve d'Ascq, France
[2] HEI, F-59046 Lille, France
[3] ULCO, LISIC, F-62228 Calais, France
Keywords
Feature selection; Pairwise constraints; Kendall's coefficient; Constraint scores; Laplacian score; Fisher score
DOI
10.1016/j.patrec.2010.12.014
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent feature selection scores using pairwise constraints (must-link and cannot-link) have shown better performance than unsupervised methods and performance comparable to supervised ones. However, these scores use only the pairwise constraints and ignore the available information brought by the unlabeled data. Moreover, these constraint scores strongly depend on the given must-link and cannot-link subsets built by the user. In this paper, we address these problems and propose a new semi-supervised constraint score that uses both pairwise constraints and local properties of the unlabeled data. Experiments using Kendall's coefficient and accuracy rates show that this new score is less sensitive to the given constraints than the previous scores while providing similar performance. (C) 2010 Elsevier B.V. All rights reserved.
Pages: 656-665
Number of pages: 10