Improving Fusion of Dimensionality Reduction Methods for Nearest Neighbor Classification

Cited by: 2
Authors
Deegalla, Sampath [1]
Bostrom, Henrik [1]
Affiliations
[1] Stockholm Univ, Dept Comp & Syst Sci, SE-16440 Kista, Sweden
Source
EIGHTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS | 2009
Keywords
nearest neighbor classification; dimensionality reduction; feature fusion; classifier fusion; microarrays; CANCER; TUMOR; PREDICTION; PATTERNS
DOI
10.1109/ICMLA.2009.95
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In previous studies, performance improvement of nearest neighbor classification of high-dimensional data, such as microarrays, has been investigated using dimensionality reduction. It has been demonstrated that the fusion of dimensionality reduction methods, either by fusing classifiers obtained from each set of reduced features or by fusing all reduced features, is better than using any single dimensionality reduction method. However, none of the fusion methods consistently outperforms the use of a single dimensionality reduction method. Therefore, a new way of fusing features and classifiers is proposed, which is based on searching for the optimal number of dimensions for each considered dimensionality reduction method. An empirical evaluation on microarray classification is presented, comparing classifier and feature fusion with and without the proposed method, in conjunction with three dimensionality reduction methods: Principal Component Analysis (PCA), Partial Least Squares (PLS) and Information Gain (IG). The new classifier fusion method outperforms the previous one in 4 out of 8 cases, and is on par with the best single dimensionality reduction method. The novel feature fusion method is, however, outperformed by the previous method, which selects the same number of features from each dimensionality reduction method. Hence, it is concluded that the idea of optimizing the number of features separately for each dimensionality reduction method can only be recommended for classifier fusion.
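The classifier fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the search procedure, the voting rule, or the validation scheme, so the sketch assumes leave-one-out accuracy as the search criterion, 1-nearest-neighbor classifiers, and a simple majority vote. Each dimensionality reduction method is stood in for by a hypothetical ranked list of feature indices (`rankings`), and all function names are illustrative.

```python
from collections import Counter

def nn_predict(train_X, train_y, x):
    """1-nearest-neighbor prediction using squared Euclidean distance."""
    best = min(range(len(train_X)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best]

def loo_accuracy(X, y):
    """Leave-one-out accuracy of 1-NN on (X, y)."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += nn_predict(rest_X, rest_y, X[i]) == y[i]
    return hits / len(X)

def project(X, ranked, k):
    """Keep the top-k dimensions of each row according to a ranking."""
    return [[row[j] for j in ranked[:k]] for row in X]

def best_k(X, y, ranked):
    """Assumed search: pick the smallest k with the highest LOO accuracy."""
    return max(range(1, len(ranked) + 1),
               key=lambda k: loo_accuracy(project(X, ranked, k), y))

def fuse_classifiers(X, y, rankings, x):
    """Majority vote over one 1-NN per reduction, each with its own k."""
    votes = []
    for ranked in rankings:
        k = best_k(X, y, ranked)  # optimized separately per method
        reduced_x = project([x], ranked, k)[0]
        votes.append(nn_predict(project(X, ranked, k), y, reduced_x))
    return Counter(votes).most_common(1)[0][0]
```

Calling `fuse_classifiers(X, y, rankings, x)` with lists of numeric rows `X`, labels `y`, and one ranking per reduction method returns the voted label for `x`. The point of the sketch is only that `best_k` is evaluated once per method, rather than one shared number of dimensions being imposed on all of them.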
Pages: 771-775
Number of pages: 5