Initial points selection for clustering gene expression data: A spatial contiguity analysis-based approach

Cited by: 1

Authors:

Yi, Hui [1,2]

Bo, Cuimei [1]

Song, Xiaofeng [2]

Yuan, Yuhao [1]

Affiliations:

[1] Nanjing Univ Technol, Coll Automat & Elect Engn, Nanjing 211816, Jiangsu, Peoples R China

[2] Nanjing Univ Aeronaut & Astronaut, Dept Biomed Engn, Nanjing 210016, Peoples R China

Source:

BIO-MEDICAL MATERIALS AND ENGINEERING | 2014 / Vol. 24 / No. 06

Funding:

Specialized Research Fund for the Doctoral Program, Ministry of Education of China; US National Science Foundation;

Keywords:

Gene expression data; k-means; initial points; spatial contiguity analysis; ALGORITHM;

DOI:

10.3233/BME-141199

Chinese Library Classification number:

R318 [Biomedical Engineering];

Discipline classification code:

0831 ;

Abstract:

Clustering is considered one of the most powerful tools for analyzing gene expression data. Although clustering has been extensively studied, a problem remains significant: iterative techniques like k-means clustering are especially sensitive to initial starting conditions. An unreasonable selection of initial points leads to problems including local minima and massive computation. In this paper, a spatial contiguity analysis-based approach is proposed, aiming to solve this problem. It employs principal component analysis (PCA) to identify data points that are likely extracted from different clusters as initial points. This helps to avoid local minima, and accelerates the computation. The effectiveness of the proposed approach was validated on several benchmark datasets.
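The abstract does not spell out the spatial contiguity analysis itself, so the following is only a minimal sketch of the general idea of PCA-guided seeding for k-means, not the authors' method. It assumes a simple stand-in rule: project the samples onto the first principal component, split them into k equally sized groups along that axis, and use each group's mean as an initial point. The function names `pca_initial_points` and `kmeans` and the synthetic three-cluster data are hypothetical and chosen for illustration.

```python
import numpy as np

def pca_initial_points(X, k):
    """Pick k initial centroids by splitting samples along the first
    principal component (an assumed stand-in, not the paper's procedure)."""
    Xc = X - X.mean(axis=0)
    # The first right singular vector of the centered data is the first principal axis.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ vt[0]
    # Sort samples by their projection and cut them into k contiguous groups.
    groups = np.array_split(np.argsort(scores), k)
    return np.vstack([X[g].mean(axis=0) for g in groups])

def kmeans(X, centroids, max_iter=100):
    """Plain Lloyd iterations starting from the supplied centroids."""
    centroids = centroids.copy()
    for _ in range(max_iter):
        # Distance from every sample to every centroid, then nearest-centroid labels.
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        updated = np.vstack([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

# Synthetic data: three well-separated groups of 50 samples each (illustrative only).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

init = pca_initial_points(X, 3)
labels, final = kmeans(X, init)
```

Because the seeds are derived deterministically from the data rather than drawn at random, repeated runs start from the same points, which is the property the abstract associates with avoiding poor local minima and reducing computation.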


Pages: 3709-3717

Page count: 9
