Initial points selection for clustering gene expression data: A spatial contiguity analysis-based approach

Times cited: 1
Authors
Yi, Hui [1, 2]
Bo, Cuimei [1]
Song, Xiaofeng [2]
Yuan, Yuhao [1]
Affiliations
[1] Nanjing Univ Technol, Coll Automat & Elect Engn, Nanjing 211816, Jiangsu, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Dept Biomed Engn, Nanjing 210016, Peoples R China
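Funding
Specialized Research Fund for the Doctoral Program of Higher Education (Ministry of Education of China); National Science Foundation (USA);
Keywords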
Gene expression data; k-means; initial points; spatial contiguity analysis; ALGORITHM;
DOI
10.3233/BME-141199
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Clustering is considered one of the most powerful tools for analyzing gene expression data. Although clustering has been extensively studied, a problem remains significant: iterative techniques like k-means clustering are especially sensitive to initial starting conditions. An unreasonable selection of initial points leads to problems including local minima and massive computation. In this paper, a spatial contiguity analysis-based approach is proposed, aiming to solve this problem. It employs principal component analysis (PCA) to identify data points that are likely extracted from different clusters as initial points. This helps to avoid local minima, and accelerates the computation. The effectiveness of the proposed approach was validated on several benchmark datasets.
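The idea in the abstract, using PCA to pick initial points that probably belong to different clusters and then running k-means from them, can be illustrated with a minimal sketch. This is not the authors' spatial contiguity analysis, whose details the abstract does not give. It assumes a simpler stand-in heuristic: project the data onto the first principal component, cut the sorted projections into k equal-size groups, and take the middle point of each group. The function names `pca_initial_points` and `kmeans` are hypothetical.

```python
import numpy as np

def pca_initial_points(X, k):
    """Pick k rows of X spread along the first principal component.

    Illustrative heuristic only (an assumption, not the published method).
    Requires k <= number of rows in X.
    """
    Xc = X - X.mean(axis=0)
    # The first right singular vector of the centered data is the first principal axis.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ vt[0]
    order = np.argsort(scores)
    # Split the sorted indices into k contiguous groups; take the middle point of each.
    groups = np.array_split(order, k)
    idx = np.array([g[len(g) // 2] for g in groups])
    return X[idx]

def kmeans(X, init, n_iter=100):
    """Plain Lloyd iterations starting from the given initial centers."""
    centers = init.astype(float).copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center; keep the old one if its cluster is empty.
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels
```

For a samples-by-features matrix `X`, calling `kmeans(X, pca_initial_points(X, k))` gives a deterministic starting point, so repeated runs produce the same clustering, unlike random initialization.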
Pages: 3709-3717
Number of pages: 9
Related papers
50 in total
  • [1] Spatial clustering based gene selection for gene expression analysis in microarray data classification
    Dhas, P. Edwin
    Lalitha, S.
    Govindaraj, Annalakshmi
    Jyoshna, B.
    AUTOMATIKA, 2024, 65 (01) : 152 - 158
  • [2] A kernel-based clustering method for gene selection with gene expression data
    Chen, Huihui
    Zhang, Yusen
    Gutman, Ivan
    JOURNAL OF BIOMEDICAL INFORMATICS, 2016, 62 : 12 - 20
  • [3] Min max kurtosis distance based improved initial centroid selection approach of K-means clustering for big data mining on gene expression data
    Pandey, Kamlesh Kumar
    Shukla, Diwakar
    EVOLVING SYSTEMS, 2023, 14 (02) : 207 - 244
  • [4] An effective fuzzy kernel clustering analysis approach for gene expression data
    Sun, Lin
    Xu, Jiucheng
    Yin, Jiaojiao
    BIO-MEDICAL MATERIALS AND ENGINEERING, 2015, 26 : S1863 - S1869
  • [5] Fuzzy entropy clustering by searching local border points for the analysis of gene expression data
    Zeng, Yiping
    Xu, Zeshui
    He, Yue
    Rao, Yu
    KNOWLEDGE-BASED SYSTEMS, 2020, 190
  • [6] A model selection criterion for model-based clustering of annotated gene expression data
    Gallopin, Melina
    Celeux, Gilles
    Jaffrezic, Florence
    Rau, Andrea
    STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, 2015, 14 (05) : 413 - 428
  • [7] A robust fuzzy approach for gene expression data clustering
    Jahan, Meskat
    Hasan, Mahmudul
    SOFT COMPUTING, 2021, 25 (23) : 14583 - 14596
  • [8] Min max kurtosis distance based improved initial centroid selection approach of K-means clustering for big data mining on gene expression data
    Kamlesh Kumar Pandey
    Diwakar Shukla
    Evolving Systems, 2023, 14 : 207 - 244
  • [9] A robust approach based on Weibull distribution for clustering gene expression data
    Huakun Wang
    Zhenzhen Wang
    Xia Li
    Binsheng Gong
    Lixin Feng
    Ying Zhou
    Algorithms for Molecular Biology, 6
  • [10] An Ensemble Approach for Gene Selection in Gene Expression Data
    Castellanos-Garzon, Jose A.
    Ramos, Juan
    Lopez-Sanchez, Daniel
    de Paz, Juan F.
    11TH INTERNATIONAL CONFERENCE ON PRACTICAL APPLICATIONS OF COMPUTATIONAL BIOLOGY & BIOINFORMATICS, 2017, 616 : 237 - 247