A New Family of Similarity Measures for Scoring Confidence of Protein Interactions Using Gene Ontology

Cited by: 4
Authors
Paul, Madhusudan [1 ]
Anand, Ashish [2 ]
Affiliations
[1] Visva Bharati, Dept Comp & Syst Sci, Santini Ketan 731235, W Bengal, India
[2] Indian Inst Technol Guwahati, Dept Comp Sci & Engn, Gauhati 781039, Assam, India
Keywords
Protein-protein interaction; semantic similarity measures; gene ontology; specificity; information content; set-discriminating power; KEGG pathways; ROC curve; Pfam; MEASURING SEMANTIC SIMILARITY; COMPARATIVE GENOME ANALYSIS; INFORMATION-CONTENT; INTERACTION NETWORK; GO TERMS; PREDICTION; DATABASE; EXPRESSION; INFERENCE; FEATURES;
DOI
10.1109/TCBB.2021.3083150
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Large-scale protein-protein interaction (PPI) data have the potential to play a significant role in understanding cellular processes. However, a considerable fraction of false positives is a bottleneck in realizing this potential. There have been continuous efforts to use complementary resources to score the confidence of PPIs so that false-positive interactions receive a low confidence score. Gene Ontology (GO), a taxonomy of biological terms that represents the properties of gene products and their relations, has been widely used for this purpose. We use GO to introduce a new set of specificity measures: Relative Depth Specificity (RDS), Relative Node-based Specificity (RNS), and Relative Edge-based Specificity (RES), leading to a new family of similarity measures. We use these similarity measures to obtain a confidence score for each PPI. We evaluate the new measures on four different benchmarks and show that all three are effective. Notably, RNS and RES distinguish true PPIs from false positives more effectively than the existing alternatives. RES also shows robust set-discriminating power and can be useful for protein functional clustering as well.
Pages: 19-30
Number of pages: 12