Distance based feature selection for clustering microarray data

Cited by: 0
Authors
Dash, Manoranjan [1]
Gopalkrishnan, Vivekanand [1]
Affiliations
[1] Nanyang Technological University, Singapore, Singapore
Source
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS | 2008 / Vol. 4947
Keywords
feature selection; clustering; distance function; microarray data
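DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract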
In microarray data analysis, clustering is the fundamental task for separating genes into biologically functional groups or for classifying tissues and phenotypes. With recent gene expression microarray technologies, the expression levels of thousands of genes (features) can be measured simultaneously in a single experiment. The large number of genes, many of them noisy, makes cluster analysis highly complex. This challenge has raised the demand for feature selection, an effective dimensionality reduction technique that removes noisy features. In this paper we propose a novel filter method for feature selection. The suggested method, called ClosestFS, is based on a distance measure. For each feature, the distance is evaluated by computing the feature's impact on the histogram of the whole data set. Our experimental results show that the clustering quality (evaluated by several widely used measures) of the K-means algorithm with ClosestFS as a pre-processing step is significantly better than that of plain K-means.
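The abstract does not give the details of ClosestFS, so the following is only a rough sketch of the general idea of a histogram-impact filter, not the authors' algorithm. It assumes that a feature's "impact" is the L1 change in the normalized histogram of pairwise Euclidean distances when that feature is left out. The function names `distance_histogram`, `feature_impacts` and `select_features`, the bin count, and the toy data are all hypothetical.

```python
import math
import random
from itertools import combinations


def distance_histogram(rows, features, bins=10):
    """Normalized histogram of pairwise Euclidean distances over `features`."""
    dists = [
        math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))
        for a, b in combinations(rows, 2)
    ]
    top = max(dists) or 1.0  # guard against all-identical points
    counts = [0] * bins
    for d in dists:
        counts[min(int(d / top * bins), bins - 1)] += 1
    return [c / len(dists) for c in counts]


def feature_impacts(rows, bins=10):
    """L1 change in the distance histogram when each feature is left out."""
    all_features = range(len(rows[0]))
    full = distance_histogram(rows, all_features, bins)
    impacts = []
    for f in all_features:
        rest = [g for g in all_features if g != f]
        reduced = distance_histogram(rows, rest, bins)
        impacts.append(sum(abs(p - q) for p, q in zip(full, reduced)))
    return impacts


def select_features(rows, k, bins=10):
    """Indices of the k features with the largest histogram impact (assumed criterion)."""
    impacts = feature_impacts(rows, bins)
    ranked = sorted(range(len(impacts)), key=lambda f: -impacts[f])
    return sorted(ranked[:k])


# Toy data (made up): features 0 and 1 place samples in four groups,
# features 2-4 are low-amplitude noise.
rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
data = [
    [cx, cy] + [rng.uniform(0.0, 0.5) for _ in range(3)]
    for cx, cy in centers
    for _ in range(5)
]
selected = select_features(data, k=2)
```

In a pipeline like the one the abstract describes, the columns in `selected` would be the only ones passed on to K-means. One known limitation of this leave-one-out criterion is that two redundant informative features can mask each other, which the published method may handle differently.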
Pages: 512 - 519
Page count: 8
Related papers
50 in total
  • [21] FEATURE SELECTION FOR MICROARRAY DATA USING PROBABILITY DISTANCES
    Korenblat, K.
    Volkovich, Z.
    JP JOURNAL OF BIOSTATISTICS, 2012, 7 (01) : 15 - 34
  • [22] Feature Selection-Based Clustering on Micro-blogging Data
    Dutta, Soumi
    Ghatak, Sujata
    Das, Asit Kumar
    Gupta, Manan
    Dasgupta, Sayantika
    COMPUTATIONAL INTELLIGENCE IN DATA MINING, 2019, 711 : 885 - 895
  • [23] Robust Feature Selection for Microarray Data Based on Multicriterion Fusion
    Yang, Feng
    Mao, K. Z.
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2011, 8 (04) : 1080 - 1092
  • [24] Comparative study of feature selection methods on microarray data
    Miyamoto, T
    Uchimura, S
    Hamamoto, Y
    Iizuka, N
    Oka, M
    Yamada-Okabe, H
    IEEE EMBS APBME 2003, 2003, : 82 - 83
  • [25] A Feature Selection Framework Based on Supervised Data Clustering
    Liu, Hongzhi
    Fu, Bin
    Jiang, Zhengshen
    Wu, Zhonghai
    Hsu, D. Frank
    2016 IEEE 15TH INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS & COGNITIVE COMPUTING (ICCI*CC), 2016, : 316 - 321
  • [26] Feature selection based on partition clustering
    Liu, Shuang
    Zhao, Qiang
    Wu, Xiang
    INTERNATIONAL JOURNAL OF KNOWLEDGE-BASED AND INTELLIGENT ENGINEERING SYSTEMS, 2014, 18 (02) : 135 - 142
  • [27] Multimodal feature selection from microarray data based on Dempster–Shafer evidence fusion
    Nadia Nekouie
    Morteza Romoozi
    Mahdi Esmaeili
    The Journal of Supercomputing, 2023, 79 : 12591 - 12621
  • [28] Clustering Mining Method of College Students' Employment Data Based on Feature Selection
    Qi, Mei-bin
    ADVANCED HYBRID INFORMATION PROCESSING, PT I, 2022, 416 : 105 - 115
  • [29] A GA-based Feature Selection for High-dimensional Data Clustering
    Sun, Mei
    Xiong, Langhuan
    Sun, Haojun
    Jiang, Dazhi
    THIRD INTERNATIONAL CONFERENCE ON GENETIC AND EVOLUTIONARY COMPUTING, 2009, : 769 - 772
  • [30] Comparison and Evaluation of the Combinations of Feature Selection and Classifier on Microarray Data
    Yan, Chaokun
    Zhang, Jun
    Kang, Xi
    Gong, Zhengze
    Wang, Jianlin
    Zhang, Ge
    2021 IEEE 6TH INTERNATIONAL CONFERENCE ON BIG DATA ANALYTICS (ICBDA 2021), 2021, : 133 - 137