mCNN-ETC: identifying electron transporters and their functional families by using multiple windows scanning techniques in convolutional neural networks with evolutionary information of protein sequences

Cited by: 17
Authors
Ho, Quang-Thai [1 ,2 ]
Le, Nguyen Quoc Khanh [3 ]
Ou, Yu-Yen [4 ]
Affiliations
[1] Yuan Ze Univ, Comp Sci Dept, Chungli, Taiwan
[2] Yuan Ze Univ, Engn Dept, Chungli, Taiwan
[3] Taipei Med Univ, Profess Master Program Artificial Intelligence Me, Taipei, Taiwan
[4] Yuan Ze Univ, Dept Comp Sci & Engn, 135 Yuan Tung Rd, Taoyuan 320, Taiwan
Keywords
electron transport chain; five complexes; convolutional neural network; deep learning; position-specific scoring matrix; motif scanning; PSI-BLAST; DATABASE; ARCHITECTURE; ALIGNMENT; ACCURACY; SEARCHES; TOOL
DOI
10.1093/bib/bbab352
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Over the past decade, convolutional neural networks (CNNs) have become powerful tools for visual data tasks. However, many attempts to apply CNNs to protein function prediction and to extracting useful information from protein sequences have had limitations. In this research, we propose a new method that addresses the weaknesses of the previous method. mCNN-ETC is a deep learning model that transforms protein evolutionary information into image-like data composed of 20 channels, which correspond to the 20 amino acids in the protein sequence. We constructed CNN layers with different scanning windows in parallel to enhance the model's ability to detect useful patterns. We then filtered specific patterns through a 1-max pooling layer before passing them to the prediction layer. As an application, this research addresses a basic problem in biology: predicting electron transporters and classifying their corresponding complexes. The model reached an accuracy of 97.41%, nearly 6% higher than its predecessor. We have also published a web server at http://bio219.bioinfo.yzu.edu.tw, which can be used for research purposes free of charge.
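The parallel scanning windows and 1-max pooling described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `multi_window_features`, the window sizes (3, 5, 7), the filter count (8), the ReLU activation, and the random weights are all assumptions chosen for illustration. The only details taken from the abstract are the 20-channel evolutionary profile as input, convolutions with several window sizes run in parallel, and a 1-max pooling step before prediction.

```python
import numpy as np


def multi_window_features(pssm, filters_by_window):
    """Scan a profile with several window sizes and 1-max pool each filter.

    pssm: array of shape (sequence_length, 20), one channel per amino acid.
    filters_by_window: dict mapping window size -> weights of shape
        (n_filters, window, 20). Hypothetical layout, for illustration only.
    Returns a 1-D vector with one pooled value per filter.
    """
    length = pssm.shape[0]
    pooled = []
    for window, weights in sorted(filters_by_window.items()):
        # Every contiguous stretch of `window` residues: (positions, window, 20).
        windows = np.stack(
            [pssm[i:i + window] for i in range(length - window + 1)]
        )
        # Convolution as a dot product of each stretch with each filter.
        activations = np.einsum("pwc,fwc->pf", windows, weights)
        activations = np.maximum(activations, 0.0)  # ReLU (an assumption)
        # 1-max pooling: keep only the strongest response of each filter.
        pooled.append(activations.max(axis=0))
    return np.concatenate(pooled)


# Usage with random stand-in data (not a real PSSM).
rng = np.random.default_rng(0)
pssm = rng.normal(size=(50, 20))
filters = {w: rng.normal(size=(8, w, 20)) for w in (3, 5, 7)}
features = multi_window_features(pssm, filters)
```

Because 1-max pooling keeps a single value per filter, the feature vector has a fixed length (here 3 windows × 8 filters) regardless of the sequence length, which is what lets a fixed-size prediction layer follow it.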
Pages: 11