Protein function prediction from dynamic protein interaction network using gene expression data

Cited by: 8
Authors
Saha, Sovan [1]
Prasad, Abhimanyu [1]
Chatterjee, Piyali [2]
Basu, Subhadip [3]
Nasipuri, Mita [2]
Affiliations
[1] Dr Sudhir Chandra Degree Engn Coll, Dept Comp Sci & Engn, 540 Dum Dum Rd,Near Dum Dum Jn Stn, Kolkata 700074, India
[2] Netaji Subhash Engn Coll, Dept Comp Sci & Engn, Kolkata 700152, India
[3] Jadavpur Univ, Dept Comp Sci & Engn, 188 Raja SC Mallick Rd, Kolkata 700032, India
Keywords
Protein function prediction; dynamic protein interaction network; gene expression data; protein-protein interaction network; COMPLEXES; SEQUENCE
DOI
10.1142/S0219720019500252
CLC number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Computational prediction of the functional annotation of proteins is a difficult task. There is an ever-increasing gap between the functional characterization of protein sequences and the deluge of protein sequences generated by large-scale sequencing projects. Protein interactions are dynamic, and they are mostly influenced by changes of state or changes in stimuli. The function of a protein can be inferred from its interactions with other proteins, and these interactions are themselves dynamic. In this work, we use a dynamic protein-protein interaction network (PPIN), time-course gene expression data and protein sequence information to predict the functional annotation of proteins. It has also been observed that, during the progression of a particular function, not all proteins are active at all time points. For unannotated active proteins, our proposed methodology explores the dynamic PPIN consisting of level-1 and level-2 neighboring proteins at different time points. The network is filtered using the Damerau-Levenshtein edit distance, which estimates the similarity between two protein sequences, and coefficient-of-variation methods, which assess the strength of an edge in the network. Finally, from the filtered dynamic PPIN, at each time point, functional annotations of the level-2 proteins are assigned to the unknown and unannotated active proteins through the level-1 neighbor, following a bottom-up strategy. Our proposed methodology achieves an average precision, recall and F-score of 0.59, 0.76 and 0.61, respectively, which are significantly higher than those of the reported state-of-the-art methods.
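The two filtering measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which variant of the Damerau-Levenshtein distance is used, what the coefficient of variation is computed over, or which thresholds apply. The sketch assumes the restricted (optimal string alignment) variant and a population coefficient of variation over a numeric profile; the function names `osa_distance` and `coefficient_of_variation` are hypothetical.

```python
from statistics import mean, pstdev


def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Counts insertions, deletions, substitutions and transpositions of
    adjacent characters. Assumption: the paper may use a different variant.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i leading characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j leading characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def coefficient_of_variation(values) -> float:
    """Population standard deviation divided by the mean.

    Assumption: what the values represent (for example, an expression
    profile over time points) is not specified in the abstract.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(values) / m


if __name__ == "__main__":
    # Toy sequences and values, not data from the paper.
    print(osa_distance("ACGT", "CAGT"))
    print(coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9]))
```

A lower edit distance indicates more similar sequences, and a higher coefficient of variation indicates more relative spread. How these quantities are turned into filtering decisions on the PPIN is described in the full paper, not in the abstract.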
Pages: 15