DeepRF: A deep learning method for predicting metabolic pathways in organisms based on annotated genomes

Cited by: 8
Authors
Shah, Hayat Ali [1]
Liu, Juan [1]
Yang, Zhihui [1]
Zhang, Xiaolei [1]
Feng, Jing [1]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Inst Artificial Intelligence, Wuhan, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Metabolic pathway; Deep learning; Pathway databases; Organisms; Prediction; BIOCYC COLLECTION; METACYC; DISCOVERY; DATABASE;
DOI
10.1016/j.compbiomed.2022.105756
Chinese Library Classification number
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
The rapid growth of metabolomics has sharpened the focus on metabolic pathway modeling and reconstruction. In particular, reconstructing an organism's metabolic network from its genome sequence is a key challenge in systems biology. The established approach predicts the presence or absence of metabolic pathways in an organism from the known pathways in a reference database. However, it relies on manual construction of metabolic pathways and does not scale to large genome sequencing data. To address these problems, we apply a supervised machine learning approach in which deep neural networks learn feature representations of metabolic pathways, and these representations are fed into random forests to predict metabolic pathways. The resulting model, DeepRF, predicts all known and unknown metabolic pathways in an organism. Evaluated on over 318,016 instances, DeepRF predicts metabolic pathways with high accuracy (>97%), recall (>95%), and precision (>99%). A comparison with other methods in the literature shows that DeepRF produces more reliable results.
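For readers unfamiliar with the three metrics quoted in the abstract, the sketch below shows how accuracy, recall, and precision are conventionally computed from binary predictions (1 = pathway present, 0 = absent). It is an illustration only: the function name `classification_metrics` and the toy labels are assumptions made here, not the authors' evaluation code or data.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, recall, and precision for binary labels (1 = pathway present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = tp + tn + fp + fn
    return {
        # Guard each ratio against an empty denominator.
        "accuracy": (tp + tn) / total if total else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Hypothetical toy labels, not taken from the paper's dataset.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Recall measures how many of the truly present pathways are recovered, while precision measures how many predicted pathways are actually present, which is why the abstract reports both alongside accuracy.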
Pages: 10