Predicting protein-protein interactions using high-quality non-interacting pairs

Cited by: 24
Authors
Zhang, Long [1]
Yu, Guoxian [1]
Guo, Maozu [2, 3]
Wang, Jun [1]
Affiliations
[1] Southwest Univ, Coll Comp & Informat Sci, Chongqing, Peoples R China
[2] Beijing Univ Civil Engn & Architecture, Sch Elect & Informat Engn, Beijing, Peoples R China
[3] Beijing Key Lab Intelligent Proc Bldg Big Data, Beijing, Peoples R China
Source
BMC BIOINFORMATICS | 2018 / Volume 19
Keywords
Protein-protein interactions; Non-interacting proteins; Deep neural networks; Sequence similarity; Random walk; HYDROPHOBICITY; NETWORKS; GENOME;
DOI
10.1186/s12859-018-2525-3
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Background: Identifying protein-protein interactions (PPIs) is of paramount importance for understanding cellular processes. Machine learning-based approaches have been developed to predict PPIs, but the effectiveness of these approaches is unsatisfactory. One major reason is that they randomly choose non-interacting protein pairs (negative samples) or heuristically select non-interacting pairs with low quality.
Results: To boost the effectiveness of predicting PPIs, we propose two novel approaches (NIP-SS and NIP-RW) to generate high quality non-interacting pairs based on sequence similarity and random walk, respectively. Specifically, the known PPIs collected from public databases are used to generate the positive samples. NIP-SS then selects the top-m dissimilar protein pairs as negative examples and controls the degree distribution of selected proteins to construct the negative dataset. NIP-RW performs random walk on the PPI network to update the adjacency matrix of the network, and then selects protein pairs not connected in the updated network as negative samples. Next, we use auto covariance (AC) descriptor to encode the feature information of amino acid sequences. After that, we employ deep neural networks (DNNs) to predict PPIs based on extracted features, positive and negative examples. Extensive experiments show that NIP-SS and NIP-RW can generate negative samples with higher quality than existing strategies and thus enable more accurate prediction.
Conclusions: The experimental results prove that negative datasets constructed by NIP-SS and NIP-RW can reduce the bias and have good generalization ability. NIP-SS and NIP-RW can be used as a plugin to boost the effectiveness of PPIs prediction. Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NIP.
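The random-walk idea behind NIP-RW can be illustrated with a minimal sketch. The abstract does not give the update rule, the number of steps, or any threshold, so everything below is an assumption made for illustration: the function names (`row_normalize`, `matmul`, `random_walk_negatives`) and the `steps` parameter are hypothetical, and the "updated adjacency matrix" is taken to be the sum of the 1-step to k-step transition probabilities. Pairs whose entry stays at zero are treated as candidate negative samples. This is not the authors' released implementation.

```python
def row_normalize(adj):
    """Turn an adjacency matrix into a row-stochastic transition matrix."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def matmul(a, b):
    """Plain-Python product of two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def random_walk_negatives(adj, steps=2):
    """Return pairs (i, j), i < j, not reached by a walk of up to `steps` steps.

    Assumption: the updated network is the sum of the 1..steps-step
    transition matrices; a zero entry means "still not connected".
    """
    trans = row_normalize(adj)
    walk = trans
    reach = [row[:] for row in trans]  # 1-step probabilities
    for _ in range(steps - 1):
        walk = matmul(walk, trans)     # probabilities after one more step
        reach = [[r + w for r, w in zip(r_row, w_row)]
                 for r_row, w_row in zip(reach, walk)]
    n = len(adj)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if reach[i][j] == 0 and reach[j][i] == 0]


# Toy PPI network: a chain of six proteins 0-1-2-3-4-5.
chain = [[1 if abs(i - j) == 1 else 0 for j in range(6)] for i in range(6)]
negatives = random_walk_negatives(chain, steps=2)
print(negatives)
```

On the toy chain, only pairs at least three hops apart survive as candidate negatives. A larger `steps` value leaves fewer candidates, which is the trade-off between negative-set size and confidence that the pairs do not interact.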
Pages: 20