Evaluation of clustering algorithms for gene expression data using gene ontology annotations

Cited by: 3
Authors
Ma Ning [1]
Zhang Zheng-guo [1]
Affiliations
[1] Chinese Acad Med Sci, Peking Union Med Coll, Inst Basic Med Sci, Dept Biomed Engn, Sch Basic Med, Beijing 100005, Peoples R China
Keywords
microarray; gene expression; clustering; gene ontology; TOOL
DOI
10.3760/cma.j.issn.0366-6999.2012.17.015
Chinese Library Classification number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Background: Clustering is a useful exploratory technique for interpreting gene expression data, as it reveals groups of genes that share common functional attributes. Biologists frequently face the problem of choosing an appropriate algorithm. We aimed to provide a standalone, easily accessible, and biologically oriented criterion for evaluating the clustering of expression data.
Methods: An external criterion that uses annotation-based similarities between genes is proposed in this work. Gene ontology information is employed as the annotation source. Six widely used clustering algorithms were compared over various types of gene expression data sets using the proposed criterion.
Results: The ranking of these algorithms given by the criterion agrees with common knowledge. Single-linkage performs significantly worse than the others, even worse than the random algorithm. Ward's method achieves the best performance in most cases.
Conclusions: The proposed criterion has a strong ability to distinguish among clustering algorithms that use different distance measurements. It is also demonstrated that analyzing the main contributors to the criterion may offer some guidance in finding locally compact clusters. In addition, we suggest using Ward's algorithm for gene expression data analysis.
Chin Med J 2012;125(17):3048-3052
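The abstract does not give the formula for the proposed criterion, so the following is only a minimal sketch of the general idea of an external, annotation-based score. It assumes that each gene carries a set of annotation terms, that similarity between two genes is the Jaccard index of their term sets, and that a clustering is scored by the mean similarity over gene pairs placed in the same cluster. The function names, the toy genes, and the placeholder term identifiers are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of annotation terms."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def within_cluster_similarity(labels, annotations):
    """Mean annotation similarity over all gene pairs that share a cluster.

    labels: dict mapping gene -> cluster id
    annotations: dict mapping gene -> set of annotation term ids
    """
    # Group genes by their assigned cluster.
    clusters = {}
    for gene, cluster_id in labels.items():
        clusters.setdefault(cluster_id, []).append(gene)

    # Collect the similarity of every within-cluster gene pair.
    sims = [
        jaccard(annotations[g], annotations[h])
        for members in clusters.values()
        for g, h in combinations(sorted(members), 2)
    ]
    return sum(sims) / len(sims) if sims else 0.0


# Toy data with placeholder term ids (hypothetical, not real GO terms).
annotations = {
    "g1": {"GO:A", "GO:B"},
    "g2": {"GO:A", "GO:B"},
    "g3": {"GO:C"},
    "g4": {"GO:C", "GO:D"},
}

# Two candidate clusterings of the same four genes.
coherent = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}
mixed = {"g1": 0, "g3": 0, "g2": 1, "g4": 1}

score_coherent = within_cluster_similarity(coherent, annotations)
score_mixed = within_cluster_similarity(mixed, annotations)
print(score_coherent, score_mixed)
```

Under these assumptions, a clustering that groups genes with shared annotations scores higher than one that mixes them, which is the kind of ranking an external criterion is meant to provide. A real evaluation would likely use a semantic similarity measure defined on the gene ontology graph rather than plain set overlap.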
Pages: 3048-3052
Page count: 5