Constraint Score Evaluation for Spectral Feature Selection

Cited by: 3
Authors
Kalakech, Mariam [1]
Biela, Philippe [2]
Hamad, Denis [3]
Macaire, Ludovic [4]
Affiliations
[1] Univ Libanaise, Hadath, Lebanon
[2] HEI, F-59046 Lille, France
[3] ULCO, LISIC, F-62228 Calais, France
[4] Univ Lille 1, LAGIS UMR CNRS 8219, F-59655 Villeneuve d'Ascq, France
Keywords
Feature selection; Spectral constraint scores; Pairwise constraints; Semi-supervised evaluation
DOI
10.1007/s11063-013-9280-2
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Semi-supervised contexts, in which only a few pairwise constraints between learning samples are available, are common in many real applications. Analysing these instance-level constraints with recent spectral scores has shown good performance for semi-supervised feature selection. The performance of these scores is generally evaluated by classification accuracy in a ground-truth context. However, this supervised evaluation context is inconsistent with the semi-supervised context in which the feature selection operates. In this paper, we propose a semi-supervised performance evaluation procedure in which both the feature selection and the clustering steps take into account the constraints given by the user. In this way, the selection and evaluation steps are performed in the same context, which is close to real-life applications. Extensive experiments on benchmark datasets are carried out in the last section, using both the classical supervised evaluation and the proposed semi-supervised one. They demonstrate the effectiveness of feature selection based on constraint analysis that uses both pairwise constraints and the information brought by the unlabeled data.
Pages: 155-175
Page count: 21