A Convolutional Denoising Autoencoder for Protein Scaffold Filling

Citations: 3
Authors
Sturtz, Jordan [1]
Annan, Richard [1]
Zhu, Binhai [2]
Liu, Xiaowen [3]
Qingge, Letu [1]
Affiliations
[1] North Carolina A&T State Univ, Dept Comp Sci, Greensboro, NC 27411 USA
[2] Montana State Univ, Gianforte Sch Comp, Bozeman, MT USA
[3] Tulane Univ, John W Deming Dept Med, New Orleans, LA USA
Source
BIOINFORMATICS RESEARCH AND APPLICATIONS, ISBRA 2023 | 2023 / Vol. 14248
Funding
National Science Foundation (USA);
Keywords
De Novo Protein Sequencing; Convolutional Layer; Denoising Autoencoder; Protein Scaffold Filling; PEPTIDES;
DOI
10.1007/978-981-99-7074-2_42
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
De novo protein sequencing is a valuable task in proteomics, yet it is not a fully solved problem. Many state-of-the-art approaches use top-down and bottom-up tandem mass spectrometry (MS/MS) to sequence proteins. However, these approaches often produce protein scaffolds, which are incomplete protein sequences with gaps to fill between contiguous regions. In this paper, we propose a novel convolutional denoising autoencoder (CDA) model that fills the gaps in protein scaffolds, completing the final step of protein sequencing. We demonstrate our results on both a real dataset and eleven randomly generated datasets based on the MabCampath antibody. Our results show that the proposed CDA outperforms a recently published hybrid convolutional neural network and long short-term memory (CNN-LSTM) based sequence model. We achieve 100% gap filling accuracy and 95.32% full sequence accuracy on the MabCampath protein scaffold.
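The abstract reports two separate metrics, gap filling accuracy and full sequence accuracy, without defining them here. The sketch below shows one plausible reading, not the authors' implementation: a scaffold is made by masking positions of a known sequence (the "noise" a denoising model would learn to remove), gap filling accuracy is measured only over the masked positions, and full sequence accuracy over every position. The gap symbol, the function names, and the toy sequence are all hypothetical, and gaps are placed at random single positions for simplicity rather than as contiguous runs.

```python
import random

GAP = "-"  # hypothetical gap symbol; the paper's encoding is not given in the abstract


def make_scaffold(sequence, gap_fraction, rng):
    """Mask a random subset of positions to simulate a scaffold with gaps."""
    n_gaps = round(len(sequence) * gap_fraction)
    gap_positions = set(rng.sample(range(len(sequence)), n_gaps))
    scaffold = "".join(
        GAP if i in gap_positions else residue
        for i, residue in enumerate(sequence)
    )
    return scaffold, gap_positions


def gap_filling_accuracy(target, predicted, gap_positions):
    """Fraction of masked positions whose residue was predicted correctly."""
    if not gap_positions:
        return 1.0
    correct = sum(target[i] == predicted[i] for i in gap_positions)
    return correct / len(gap_positions)


def full_sequence_accuracy(target, predicted):
    """Fraction of all positions (masked or not) predicted correctly."""
    correct = sum(t == p for t, p in zip(target, predicted))
    return correct / len(target)


if __name__ == "__main__":
    rng = random.Random(0)
    target = "ACDEFGHIKL"  # toy sequence, not the MabCampath antibody
    scaffold, gaps = make_scaffold(target, 0.3, rng)
    print(scaffold, sorted(gaps))
```

Under these assumed definitions, a model that reconstructs the whole sequence can fill every gap correctly yet still alter a residue outside the gaps, which is one way the two scores could differ.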
Pages: 518-529
Number of pages: 12