Deep learning for peptide identification from metaproteomics datasets

Times cited: 9
Authors
Feng, Shichao [1]
Sterzenbach, Ryan [2]
Guo, Xuan [1]
Affiliations
[1] Univ North Texas, Dept Comp Sci & Engn, 3940 N Elm St,Ste F290, Denton, TX 76207 USA
[2] Univ North Texas, Dept Biomed Engn, Denton, TX 76203 USA
Funding
National Institutes of Health (USA)
Keywords
Peptide identification; Deep learning; Tandem mass spectrometry; CNN; PROTEIN IDENTIFICATION; STATISTICAL-MODEL; MS/MS; CONFIDENCE; CHALLENGES; REVEALS;
DOI
10.1016/j.jprot.2021.104316
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Metaproteomics is becoming widely used in microbiome research for gaining insights into the functional state of the microbial community. Current metaproteomics studies are generally based on high-throughput tandem mass spectrometry (MS/MS) coupled with liquid chromatography. In this paper, we propose a deep-learning-based algorithm, named DeepFilter, for improving peptide identifications from a collection of tandem mass spectra. The key advantage of DeepFilter is that it does not need ad hoc training or fine-tuning, as existing filtering tools do. DeepFilter is freely available under the GNU GPL license at https://github.com/Biocomputing-Research-Group/DeepFilter.
Significance: The identification of peptides and proteins from MS data involves searching MS/MS spectra against a predefined protein sequence database and assigning top-scored peptides to spectra. Existing computational tools are still far from being able to extract all the information from MS/MS data sets acquired from metaproteome samples. Systematic experimental results demonstrate that DeepFilter identified up to 12% more peptide-spectrum matches and up to 9% more proteins than existing filtering algorithms, including Percolator, Q-ranker, PeptideProphet, and iProphet, on marine and soil microbial metaproteome samples at a false discovery rate of 1%. The taxonomic analysis shows that DeepFilter found up to 7%, 10%, and 14% more species from marine, soil, and human gut samples, respectively, compared with existing filtering algorithms. Therefore, DeepFilter is believed to generalize well to new, previously unseen peptide-spectrum matches and can be readily applied to peptide identification from metaproteomics data.
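For context, the 1% false discovery rate mentioned in the abstract is commonly estimated with a target-decoy strategy when filtering peptide-spectrum matches (PSMs). The sketch below illustrates that general idea only. It is not DeepFilter's implementation, and the function name `filter_psms_at_fdr`, the `(score, is_decoy)` tuple layout, and the simple decoys/targets estimator are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical PSM representation: (score, is_decoy), higher score = better match.
PSM = Tuple[float, bool]


def filter_psms_at_fdr(psms: List[PSM], fdr_threshold: float = 0.01) -> List[PSM]:
    """Return the target PSMs accepted at the given FDR.

    Assumption: FDR at a score cutoff is estimated as
    (#decoys above cutoff) / (#targets above cutoff).
    """
    # Rank all PSMs from best to worst score.
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    targets = 0
    decoys = 0
    keep = 0  # number of top-ranked PSMs retained at the deepest passing cutoff

    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        estimated_fdr = decoys / max(targets, 1)
        # Remember the deepest rank at which the estimate is still acceptable.
        if estimated_fdr <= fdr_threshold:
            keep = i

    # Report only target PSMs; decoys exist solely to estimate the error rate.
    return [p for p in ranked[:keep] if not p[1]]
```

A filtering tool in this setting would supply the scores (for example, from a trained classifier), and the counting step above would then pick the score cutoff that meets the requested FDR.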
Pages: 14