NetQuilt: deep multispecies network-based protein function prediction using homology-informed network similarity

Cited by: 14
Authors
Barot, Meet [1]
Gligorijevic, Vladimir [2]
Cho, Kyunghyun [1]
Bonneau, Richard [1, 2]
Affiliations
[1] NYU, Ctr Data Sci, New York, NY 10011 USA
[2] Flatiron Inst, Ctr Computat Biol, New York, NY 10010 USA
Funding
US National Science Foundation; US National Institutes of Health
Keywords
MAXIMIZING ACCURACY; GLOBAL ALIGNMENT; SEQUENCE
DOI
10.1093/bioinformatics/btab098
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Transferring knowledge between species is challenging: different species contain distinct proteomes and cellular architectures, which cause their proteins to carry out different functions via different interaction networks. Many approaches to protein functional annotation use sequence similarity to transfer knowledge between species. These approaches cannot produce accurate predictions for proteins without homologues of known function, as many functions require cellular context for meaningful prediction. To supply this context, network-based methods use protein-protein interaction (PPI) networks as a source of information for inferring protein function and have demonstrated promising results in function prediction. However, most of these methods are tied to a network for a single species, and many species lack biological networks.
Results: In this work, we integrate sequence and network information across multiple species by computing IsoRank similarity scores to create a meta-network profile of the proteins of multiple species. We use this integrated multispecies meta-network as input to train a maxout neural network with Gene Ontology terms as target labels. Our multispecies approach takes advantage of more training examples and consequently leads to significant improvements in function prediction performance compared to two network-based methods, a deep learning sequence-based method, and the BLAST annotation method used in the Critical Assessment of Functional Annotation. We demonstrate that our approach performs well even when a species has no network information available: when an organism's PPI network is left out, our multispecies method can still make predictions for the left-out organism with good performance.
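The IsoRank similarity scores mentioned in the abstract can be illustrated with a minimal sketch of the standard IsoRank power iteration between two PPI networks. This is an assumption-laden illustration, not the authors' implementation: the function name `isorank`, the mixing weight `alpha`, the iteration limits, and the use of a dense sequence-similarity prior `E` are all choices made here for the example, and the abstract does not state the settings actually used.

```python
import numpy as np

def isorank(A1, A2, E, alpha=0.6, n_iter=100, tol=1e-9):
    """Sketch of IsoRank scores between the nodes of two networks.

    A1, A2: symmetric adjacency matrices (n1 x n1, n2 x n2).
    E: nonnegative sequence-similarity prior (n1 x n2), e.g. from BLAST.
    alpha: weight on network topology vs. sequence prior (assumed value).
    """
    A1 = np.asarray(A1, dtype=float)
    A2 = np.asarray(A2, dtype=float)
    E = np.asarray(E, dtype=float)
    E = E / E.sum()  # normalize the prior so that it sums to 1

    # Column-normalize each adjacency matrix by node degree;
    # isolated nodes (degree 0) are left as all-zero columns.
    d1 = A1.sum(axis=0)
    d2 = A2.sum(axis=0)
    P1 = A1 / np.where(d1 > 0, d1, 1.0)
    P2 = A2 / np.where(d2 > 0, d2, 1.0)

    R = E.copy()
    for _ in range(n_iter):
        # R[i, j] gathers the scores of neighbor pairs (u, v),
        # each divided by deg(u) * deg(v), mixed with the prior.
        R_new = alpha * (P1 @ R @ P2.T) + (1.0 - alpha) * E
        converged = np.abs(R_new - R).sum() < tol
        R = R_new
        if converged:
            break
    return R

# Toy example: a 3-node path network and a 3-node triangle network.
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
triangle = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
prior = np.ones((3, 3))  # uninformative sequence prior
scores = isorank(path, triangle, prior)
```

Each row of `scores` can be read as a similarity profile of one protein in the first network against all proteins in the second; stacking such profiles across species pairs is one plausible way to picture the meta-network profile described above.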
Pages: 2414-2422
Page count: 9