Deep learning for peptide identification from metaproteomics datasets

Cited by: 10
Authors
Feng, Shichao [1 ]
Sterzenbach, Ryan [2 ]
Guo, Xuan [1 ]
Affiliations
[1] Univ North Texas, Dept Comp Sci & Engn, 3940 N Elm St,Ste F290, Denton, TX 76207 USA
[2] Univ North Texas, Dept Biomed Engn, Denton, TX 76203 USA
Funding
National Institutes of Health (USA);
Keywords
Peptide identification; Deep learning; Tandem mass spectrometry; CNN; PROTEIN IDENTIFICATION; STATISTICAL-MODEL; MS/MS; CONFIDENCE; CHALLENGES; REVEALS;
DOI
10.1016/j.jprot.2021.104316
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Metaproteomics is becoming widely used in microbiome research for gaining insights into the functional state of the microbial community. Current metaproteomics studies are generally based on high-throughput tandem mass spectrometry (MS/MS) coupled with liquid chromatography. In this paper, we propose a deep-learning-based algorithm, named DeepFilter, for improving peptide identification from a collection of tandem mass spectra. The key advantage of DeepFilter is that it does not need the ad hoc training or fine-tuning that existing filtering tools require. DeepFilter is freely available under the GNU GPL license at https://github.com/Biocomputing-Research-Group/DeepFilter. Significance: The identification of peptides and proteins from MS data involves the computational procedure of searching MS/MS spectra against a predefined protein sequence database and assigning top-scored peptides to spectra. Existing computational tools are still far from being able to extract all the information from MS/MS data sets acquired from metaproteome samples. Systematic experimental results demonstrate that DeepFilter identified up to 12% more peptide-spectrum-matches and up to 9% more proteins than existing filtering algorithms, including Percolator, Q-ranker, PeptideProphet, and iProphet, on marine and soil microbial metaproteome samples at a false discovery rate of 1%. The taxonomic analysis shows that DeepFilter found up to 7%, 10%, and 14% more species from marine, soil, and human gut samples, respectively, compared with existing filtering algorithms. Therefore, DeepFilter is believed to generalize well to new, previously unseen peptide-spectrum-matches and can be readily applied to peptide identification from metaproteomics data.
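The abstract reports results "at a false discovery rate of 1%" without saying how the FDR is estimated. As an illustration only, the sketch below assumes the common target-decoy approach: peptide-spectrum-matches (PSMs) are ranked by score, and the largest prefix whose decoy-to-target ratio stays at or below the threshold is accepted. The function name `filter_psms_at_fdr` and the `(score, is_decoy)` tuple layout are hypothetical and are not taken from DeepFilter.

```python
from typing import List, Tuple

# A PSM is modeled here as (score, is_decoy); this layout is an assumption
# made for illustration, not DeepFilter's actual data structure.
PSM = Tuple[float, bool]


def filter_psms_at_fdr(psms: List[PSM], fdr_threshold: float = 0.01) -> List[PSM]:
    """Return the target PSMs accepted at the given FDR (target-decoy estimate)."""
    # Rank PSMs from best to worst score.
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    targets = 0
    decoys = 0
    cutoff = 0  # number of top-ranked PSMs to keep

    for rank, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR of the top `rank` PSMs: decoys / targets.
        if targets > 0 and decoys / targets <= fdr_threshold:
            cutoff = rank  # remember the deepest rank that still passes

    # Report only target hits above the cutoff.
    return [p for p in ranked[:cutoff] if not p[1]]
```

In this sketch a filtering tool's job is to supply the scores: a better score separates targets from decoys more cleanly, so more target PSMs survive the same 1% threshold.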
Pages: 14
Related papers
50 in total
  • [41] Semi-supervised learning for somatic variant calling and peptide identification in personalized cancer immunotherapy
    Sherafat, Elham
    Force, Jordan
    Mandoiu, Ion I.
    BMC BIOINFORMATICS, 2020, 21 (Suppl 18)
  • [42] Combining Percolator with X!Tandem for Accurate and Sensitive Peptide Identification
    Xu, Mingguo
    Li, Zhendong
    Li, Liang
    JOURNAL OF PROTEOME RESEARCH, 2013, 12 (06) : 3026 - 3033
  • [43] Identification of Antioxidant Proteins With Deep Learning From Sequence Information
    Shao, Lifen
    Gao, Hui
    Liu, Zhen
    Feng, Juan
    Tang, Lixia
    Lin, Hao
    FRONTIERS IN PHARMACOLOGY, 2018, 9
  • [44] Deep Learning Based Language Identification System From Speech
    Athira, N. P.
    Poorna, S. S.
    PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICCS), 2019, : 1094 - 1097
  • [45] Deep learning for identification of fasciculation from muscle ultrasound images
    Nodera, Hiroyuki
    Takamatsu, Naoko
    Yamazaki, Hiroki
    Satomi, Ryutaro
    Osaki, Yusuke
    Mori, Atsuko
    Izumi, Yuishin
    Kaji, Ryuji
    NEUROLOGY AND CLINICAL NEUROSCIENCE, 2019, 7 (05) : 267 - 275
  • [46] Deep Learning for the Identification of Decision Modelling Components from Text
    Goossens, Alexandre
    Claessens, Michelle
    Parthoens, Charlotte
    Vanthienen, Jan
    RULES AND REASONING, RULEML+RR 2021, 2021, 12851 : 158 - 171
  • [47] Deep Learning Methods for Virus Identification from Digital Images
    Zhang, Luxin
    Yan, Wei Qi
    2020 35TH INTERNATIONAL CONFERENCE ON IMAGE AND VISION COMPUTING NEW ZEALAND (IVCNZ), 2020,
  • [48] Banana ripeness stage identification: a deep learning approach
    N. Saranya
    K. Srinivasan
    S. K. Pravin Kumar
    Journal of Ambient Intelligence and Humanized Computing, 2022, 13 : 4033 - 4039
  • [49] Banana ripeness stage identification: a deep learning approach
    Saranya, N.
    Srinivasan, K.
    Kumar, S. K. Pravin
    JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2021, 13 (8) : 4033 - 4039
  • [50] CRMnet: A deep learning model for predicting gene expression from large regulatory sequence datasets
    Ding, Ke
    Dixit, Gunjan
    Parker, Brian J.
    Wen, Jiayu
    FRONTIERS IN BIG DATA, 2023, 6