Distance based feature selection for clustering microarray data

Cited by: 0
Authors
Dash, Manoranjan [1]
Gopalkrishnan, Vivekanand [1]
Affiliation
[1] Nanyang Technol Univ, Singapore, Singapore
Source
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS | 2008 / Vol. 4947
Keywords
feature selection; clustering; distance function; microarray data;
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
In microarray data analysis, clustering is a fundamental task for separating genes into biologically functional groups or for classifying tissues and phenotypes. With recent gene expression microarray technologies, the expression levels of thousands of genes (features) can be measured simultaneously in a single experiment. The large number of genes, many of them noisy, makes cluster analysis highly complex. This challenge has raised the demand for feature selection, an effective dimensionality reduction technique that removes noisy features. In this paper we propose a novel filter method for feature selection. The proposed method, called ClosestFS, is based on a distance measure. For each feature, the distance is evaluated by computing the feature's impact on the histogram of the whole data. Our experimental results show that the quality of the clustering results (evaluated by several widely used measures) of the K-means algorithm with ClosestFS as a pre-processing step is significantly better than that of plain K-means.
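The abstract does not spell out how the histogram or the distance is defined, so the following is only an illustrative sketch of a histogram-based filter in that spirit, not the ClosestFS procedure itself. It assumes that "the histogram for the whole data" means a histogram of pairwise Euclidean distances between samples, and that a feature's impact is the L1 difference between that histogram with and without the feature. The function names (`pairwise_distances`, `histogram`, `feature_impacts`) and the `bins` parameter are hypothetical, and at least two samples are assumed.

```python
import math
from itertools import combinations


def pairwise_distances(rows, skip=None):
    """Euclidean distance for every pair of samples, optionally ignoring one feature."""
    dists = []
    for a, b in combinations(rows, 2):
        total = sum(
            (x - y) ** 2
            for j, (x, y) in enumerate(zip(a, b))
            if j != skip
        )
        dists.append(math.sqrt(total))
    return dists


def histogram(values, bins, lo, hi):
    """Normalized histogram of values over `bins` equal-width bins on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # guard against a zero-width range
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(values)
    return [c / n for c in counts]


def feature_impacts(rows, bins=10):
    """Assumed scoring rule: how much removing each feature changes the distance histogram."""
    full = pairwise_distances(rows)
    lo, hi = 0.0, max(full)  # removing a feature can only shrink distances
    base = histogram(full, bins, lo, hi)
    scores = []
    for j in range(len(rows[0])):
        reduced = histogram(pairwise_distances(rows, skip=j), bins, lo, hi)
        scores.append(sum(abs(p - q) for p, q in zip(base, reduced)))
    return scores


# Toy data: feature 0 separates two groups, feature 1 is constant.
rows = [[0.0, 5.0], [0.1, 5.0], [10.0, 5.0], [10.1, 5.0]]
scores = feature_impacts(rows)
print(scores)  # feature 0 receives the larger score; feature 1 scores 0.0
```

How such scores would be turned into a selected feature subset before running K-means (a threshold, a top-k cut, or something else) is not stated in the abstract and is left open here.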
Pages: 512-519
Page count: 8