A New Path Based Hybrid Measure for Gene Ontology Similarity

Cited by: 20
Authors
Bandyopadhyay, Sanghamitra [1]
Mallick, Koushik [2]
Affiliations
[1] Indian Stat Inst, Machine Intelligence Unit, Kolkata 700108, W Bengal, India
[2] RCC Inst Informat Technol, CSE Dept, Kolkata 700015, W Bengal, India
Keywords
Gene ontology similarity; semantic similarity; term similarity; information content; protein interaction prediction; functional classification of genes; microRNA; SEMANTIC SIMILARITY; PROTEIN-INTERACTION; SACCHAROMYCES-CEREVISIAE; FUNCTIONAL SIMILARITY; R PACKAGE; DATABASE; GO; SEQUENCE; NETWORK; TOOLS;
DOI
10.1109/TCBB.2013.149
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Gene Ontology (GO) consists of a controlled vocabulary of terms that annotate genes or gene products, structured as a directed acyclic graph. In the graph, semantic relations connect the terms, which represent knowledge about the functional description and cellular component information of gene products. GO similarity gives a numerical representation of the biological relationship among a set of genes, which can be used to infer various biological facts such as protein interaction, structural similarity, and gene clustering. Here we introduce a new shortest-path-based hybrid measure of ontological similarity between two terms that combines the structure of the GO graph with the information content of the terms. The similarity between two terms t_1 and t_2, referred to as GOSim_PBHM(t_1, t_2), has two components: one obtained from the common ancestors of t_1 and t_2, and the other from their remaining ancestors. The proposed path-based hybrid measure does not suffer from the well-known shallow annotation problem. Its superiority with respect to some other popular measures is established for protein-protein interaction prediction, correlation with gene expression, and functional classification of genes in a biological pathway. Finally, the proposed measure is used to compute the average GO similarity score among genes that are experimentally validated targets of some microRNAs. Results demonstrate that the targets of a given miRNA have a high degree of similarity in the biological process category of GO.
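The abstract names the ingredients of the measure (common ancestors, remaining ancestors, shortest paths, information content) but does not state the formula. The sketch below only illustrates those ingredients on a toy graph. The ontology, the annotation counts, and all function names are hypothetical, and Resnik's measure stands in as a well-known information-content similarity; it is not the proposed GOSim_PBHM.

```python
import math
from collections import deque

# Hypothetical toy ontology (child -> parents); these are not real GO terms.
PARENTS = {"root": [], "A": ["root"], "B": ["root"], "C": ["A"], "D": ["A", "B"]}
# Hypothetical number of genes annotated directly to each term.
DIRECT_COUNTS = {"root": 0, "A": 1, "B": 2, "C": 3, "D": 2}


def ancestors(term):
    """Return the term itself plus every ancestor reachable through PARENTS."""
    seen = {term}
    queue = deque([term])
    while queue:
        for parent in PARENTS[queue.popleft()]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def information_content(term):
    """IC(t) = -log p(t); p(t) counts annotations to t and to its descendants."""
    total = sum(DIRECT_COUNTS.values())
    covered = sum(c for t, c in DIRECT_COUNTS.items() if term in ancestors(t))
    return -math.log(covered / total)


def split_ancestors(t1, t2):
    """Split the ancestors of two terms into a common set and a remaining set."""
    a1, a2 = ancestors(t1), ancestors(t2)
    return a1 & a2, (a1 | a2) - (a1 & a2)


def shortest_path_length(t1, t2):
    """Edge count of the shortest path between two terms, ignoring direction."""
    neighbours = {t: set(ps) for t, ps in PARENTS.items()}
    for child, ps in PARENTS.items():
        for parent in ps:
            neighbours[parent].add(child)
    dist = {t1: 0}
    queue = deque([t1])
    while queue:
        current = queue.popleft()
        if current == t2:
            return dist[current]
        for nxt in neighbours[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return None  # the two terms are not connected


def resnik(t1, t2):
    """Resnik similarity: IC of the most informative common ancestor."""
    common, _ = split_ancestors(t1, t2)
    return max(information_content(t) for t in common)


common, remaining = split_ancestors("C", "D")
print(sorted(common), sorted(remaining))
print(shortest_path_length("C", "D"), round(resnik("C", "D"), 4))
```

In this toy graph the terms C and D share the ancestors A and root, while B, C, and D fall into the remaining set. A hybrid measure in the spirit of the abstract would combine a score from each of the two sets; how the paper weights them is not given here.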
Pages: 116-127
Page count: 12