Initial points selection for clustering gene expression data: A spatial contiguity analysis-based approach

Cited by: 1
Authors
Yi, Hui [1 ,2 ]
Bo, Cuimei [1 ]
Song, Xiaofeng [2 ]
Yuan, Yuhao [1 ]
Affiliations
[1] Nanjing Univ Technol, Coll Automat & Elect Engn, Nanjing 211816, Jiangsu, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Dept Biomed Engn, Nanjing 210016, Peoples R China
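Funding
Specialized Research Fund for the Doctoral Program of Higher Education of China (Ministry of Education); National Science Foundation (USA)
Keywords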
Gene expression data; k-means; initial points; spatial contiguity analysis; algorithm
DOI
10.3233/BME-141199
Chinese Library Classification number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Clustering is considered one of the most powerful tools for analyzing gene expression data. Although clustering has been studied extensively, one significant problem remains: iterative techniques such as k-means are especially sensitive to their initial conditions. A poor selection of initial points can trap the algorithm in local minima and greatly increase computation. In this paper, a spatial contiguity analysis-based approach is proposed to address this problem. It employs principal component analysis (PCA) to identify data points that likely belong to different clusters and uses them as initial points. This helps to avoid local minima and accelerates the computation. The effectiveness of the proposed approach was validated on several benchmark datasets.
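The abstract does not give the details of the selection procedure, so the following is only a minimal sketch of the general idea of PCA-guided seeding for k-means, not the authors' algorithm. It assumes a simple rule chosen for illustration: project the points onto the first principal component, split the sorted scores into k contiguous groups, and take the median point of each group as an initial point. The function names `pca_initial_points` and `kmeans`, and the synthetic data, are hypothetical.

```python
import numpy as np


def pca_initial_points(X, k):
    """Pick k rows of X spread along the first principal component.

    Hypothetical helper: an illustration of PCA-guided seeding,
    not the procedure from the paper.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    order = np.argsort(scores)
    # Split the sorted indices into k contiguous groups; take each group's median point.
    groups = np.array_split(order, k)
    seeds = [g[len(g) // 2] for g in groups]
    return X[seeds]


def kmeans(X, init, n_iter=100):
    """Plain Lloyd iterations starting from the given initial points."""
    X = np.asarray(X, dtype=float)
    centers = np.array(init, dtype=float)
    for _ in range(n_iter):
        # Distance from every point to every center, shape (n_points, k).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels


# Synthetic stand-in for expression profiles: three well-separated groups.
rng = np.random.default_rng(0)
true_centers = np.array([[0.0, 0.0], [10.0, 0.0], [20.0, 0.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in true_centers])

init = pca_initial_points(X, k=3)
centers, labels = kmeans(X, init)
```

Because the seeds are spread along the direction of greatest variance, each one tends to start in a different cluster when the clusters separate along that direction. Data whose clusters separate along other components would need a different rule.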
Pages: 3709-3717
Number of pages: 9
Related papers
50 in total
  • [21] Clustering of gene expression data: performance and similarity analysis
    Longde Yin
    Chun-Hsi Huang
    Jun Ni
    BMC Bioinformatics, 7
  • [22] Discriminant analysis to evaluate clustering of gene expression data
    Méndez, MA
    Hödar, C
    Vulpe, C
    González, M
    Cambiazo, V
    FEBS LETTERS, 2002, 522 (1-3) : 24 - 28
  • [23] Social network Analysis-based classifier (SNAc): A case study on time course gene expression data
    Ucer, Serkan
    Kocak, Yunuscan
    Ozyer, Tansel
    Alhajj, Reda
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2017, 150 : 73 - 84
  • [24] An Effective Method Determining the Initial Cluster Centers for K-means for Clustering Gene Expression Data
    Tanir, Deniz
    Nuriyeva, Fidan
    2017 INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENGINEERING (UBMK), 2017, : 751 - 754
  • [25] Performance Enhancement of K-Means clustering algorithm for gene expression data using entropy-based centroid selection
    Trivedi, Naveen
    Kanungo, Suvendu
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND AUTOMATION (ICCCA), 2017, : 143 - 148
  • [26] A hybrid spatial data clustering method for site selection: The data driven approach of GIS mining
    Fan, Bo
    EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (02) : 3923 - 3936
  • [27] Clustering-based hybrid feature selection approach for high dimensional microarray data
    Babu, Samson Anosh P.
    Annavarapu, Chandra Sekhara Rao
    Dara, Suresh
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2021, 213
  • [28] Wavelet transform-based denoise and fuzzy clustering analysis for gene expression data
    Cui, Guangzhao
    Cao, Xianghong
    Zhou, Lili
    Cao, Lingzhi
    Huang, Buyi
    Yang, Cunxiang
    DYNAMICS OF CONTINUOUS DISCRETE AND IMPULSIVE SYSTEMS-SERIES B-APPLICATIONS & ALGORITHMS, 2007, 14 : 35 - 39
  • [29] Gene expression data clustering using a multiobjective symmetry based clustering technique
    Saha, Sriparna
    Ekbal, Asif
    Gupta, Kshitija
    Bandyopadhyay, Sanghamitra
    COMPUTERS IN BIOLOGY AND MEDICINE, 2013, 43 (11) : 1965 - 1977
  • [30] Density-Based Spatial Clustering and Ordering Points Approach for Characterizations of Tourist Behaviour
    Rodriguez-Echeverria, Jorge
    Semanjski, Ivana
    Van Gheluwe, Casper
    Ochoa, Daniel
    IJben, Harm
    Gautama, Sidharta
    ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2020, 9 (11)