Deep learning-assisted prediction of protein-protein interactions in Arabidopsis thaliana

Cited by: 4
Authors
Zheng, Jingyan [1]
Yang, Xiaodi [2]
Huang, Yan [1]
Yang, Shiping [3]
Wuchty, Stefan [4, 5, 6, 7]
Zhang, Ziding [1]
Affiliations
[1] China Agr Univ, Coll Biol Sci, State Key Lab Anim Biotech Breeding, Beijing 100193, Peoples R China
[2] Peking Univ First Hosp, Dept Hematol, Beijing 100034, Peoples R China
[3] China Agr Univ, Coll Biol Sci, State Key Lab Plant Physiol & Biochem, Beijing 100193, Peoples R China
[4] Univ Miami, Dept Comp Sci, Miami, FL 33146 USA
[5] Univ Miami, Dept Biol, Miami, FL 33146 USA
[6] Univ Miami, Sylvester Comprehens Canc Ctr, Miami, FL 33136 USA
[7] Univ Miami, Inst Data Sci & Comp, Miami, FL 33146 USA
Funding
National Natural Science Foundation of China
Keywords
Arabidopsis thaliana; protein-protein interaction; deep learning; prediction; GO annotation; domain; MOLECULAR-INTERACTIONS; DATABASE; NETWORKS; PLATFORM; WIDE
DOI
10.1111/tpj.16188
Chinese Library Classification
Q94 [Botany]
Discipline classification code
071001
Abstract
Currently, the experimentally identified interactome of Arabidopsis (Arabidopsis thaliana) is still far from complete, suggesting that computational prediction methods can complement experimental techniques. Motivated by the prosperity and success of deep learning algorithms and natural language processing techniques, we introduce an integrative deep learning framework, DeepAraPPI, allowing us to predict protein-protein interactions (PPIs) of Arabidopsis utilizing sequence, domain and Gene Ontology (GO) information. Our current DeepAraPPI comprises: (i) a word2vec encoding-based Siamese recurrent convolutional neural network (RCNN) model; (ii) a Domain2vec encoding-based multiple-layer perceptron (MLP) model; and (iii) a GO2vec encoding-based MLP model. Finally, DeepAraPPI combines the prediction results of the three individual predictors through a logistic regression model. Compiling high-quality positive and negative training and test samples by applying strict filtering strategies, DeepAraPPI shows superior performance compared with existing state-of-the-art Arabidopsis PPI prediction methods. DeepAraPPI also provides better cross-species predictive ability in rice (Oryza sativa) than traditional machine learning methods, although the overall performance in cross-species prediction remains to be improved. DeepAraPPI is freely accessible at . In the meantime, we have also made the source code and data sets of DeepAraPPI available at .
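The abstract states that DeepAraPPI combines the prediction results of three individual predictors through a logistic regression model. The sketch below illustrates that general stacking idea only; it is not the DeepAraPPI source code. It assumes each predictor emits a score in [0, 1] for a protein pair, uses synthetic toy scores in place of real model outputs, and fits a plain gradient-descent logistic regression. All names (`fit_logistic`, `predict_proba`, `make_toy_scores`) and all hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Combined interaction probability for one protein pair."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def make_toy_scores(n, seed=0):
    """Synthetic stand-in for the three predictors' outputs (assumption).

    Each row is [sequence_score, domain_score, go_score]; interacting pairs
    (label 1) are drawn around 0.7 and non-interacting pairs around 0.3.
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 0.7 if label else 0.3
        X.append([min(1.0, max(0.0, rng.gauss(center, 0.15))) for _ in range(3)])
        y.append(label)
    return X, y


X_train, y_train = make_toy_scores(200)
w, b = fit_logistic(X_train, y_train)

# Score two hypothetical protein pairs from their three predictor outputs.
high = predict_proba(w, b, [0.9, 0.8, 0.85])
low = predict_proba(w, b, [0.1, 0.2, 0.15])
print(f"weights={w}, bias={b:.3f}, high={high:.3f}, low={low:.3f}")
```

In a stacking setup of this kind, the learned weights indicate how much each individual predictor contributes to the final score. In practice the combiner would be fitted on held-out predictions rather than on the data the individual predictors were trained on.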
Pages: 984-994
Page count: 11