Machine-learning techniques for the prediction of protein-protein interactions

Cited by: 53
Authors
Sarkar, Debasree [1, 2]
Saha, Sudipto [2 ]
Affiliations
[1] SUNY Upstate Med Univ, Syracuse, NY 13210 USA
[2] Bose Inst, Div Bioinformat, Kolkata, India
Keywords
Clustering; deep learning; decision tree; machine-learning techniques; protein-protein interaction; support vector machine; SEQUENCE-BASED PREDICTION; AMINO-ACID-SEQUENCES; INTERACTION SITES; INTERACTION NETWORKS; AFFINITY PURIFICATION; INTERACTION DATABASE; INTERACTION MAP; NEURAL-NETWORK; YEAST; COMPLEXES
DOI
10.1007/s12038-019-9909-z
Chinese Library Classification (CLC) number
Q [Biological sciences]
Subject classification code
07; 0710; 09
Abstract
Protein-protein interactions (PPIs) are important for the study of protein functions and pathways involved in different biological processes, as well as for understanding the cause and progression of diseases. Several high-throughput experimental techniques have been employed for the identification of PPIs in a few model organisms, but still, there is a huge gap in identifying all possible binary PPIs in an organism. Therefore, PPI prediction using machine-learning algorithms has been used in conjunction with experimental methods for discovery of novel protein interactions. The two most popular supervised machine-learning techniques used in the prediction of PPIs are support vector machines and random forest classifiers. Bayesian-probabilistic inference has also been used but mainly for the scoring of high-throughput PPI dataset confidence measures. Recently, deep-learning algorithms have been used for sequence-based prediction of PPIs. Several clustering methods such as hierarchical and k-means are useful as unsupervised machine-learning algorithms for the prediction of interacting protein pairs without explicit data labelling. In summary, machine-learning techniques have been widely used for the prediction of PPIs thus allowing experimental researchers to study cellular PPI networks.
Pages: 12