Feature selection via Boolean independent component analysis

Cited by: 9
Authors
Apolloni, Bruno [1]
Bassis, Simone [1]
Brega, Andrea [2]
Affiliations
[1] Univ Milan, Dipartimento Sci Informaz, I-20135 Milan, Italy
[2] Univ Milan, Dipartimento Matemat F Enriques, I-20133 Milan, Italy
Keywords
Feature selection; Feature extraction; Boolean independent component analysis; Clustering; Classification; DNA microarray; SVM ensemble; GENE SELECTION; EXPRESSION DATA; CLASSIFICATION; CANCER; ALGORITHMS; PREDICTION; TISSUE
DOI
10.1016/j.ins.2009.07.002
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We devise a feature selection method in terms of a follow-out utility of a special classification procedure. In turn, we root the latter on binary features which we extract from the input patterns with a wrapper method. The whole contrivance results in a procedure that is progressive in two respects. As for features, first we compute a very essential representation of them in terms of Boolean independent components in order to reduce their entropy. Then we reverse the representation mapping to discover the subset of the original features supporting a successful classification. As for the classification, we split it into two easier tasks. With the former we look for a clustering of input patterns that satisfies loose consistency constraints and benefits from the conciseness of binary representation. With the latter we attribute labels to the clusters through the combined use of basically linear separators. We implement the method through a relatively quick numerical procedure by assembling a set of connectionist and symbolic routines. We test these on the benchmark of feature selection of DNA microarray data in cancer diagnosis and on other ancillary datasets. (c) 2009 Elsevier Inc. All rights reserved.
Pages: 3815-3831
Number of pages: 17