Development of Peptide Identification System for ToF-SIMS Spectra Using Supervised Machine Learning
Cited by: 0
Authors:
Aoyagi, Satoka [1]
Fujita, Miya [2]
Itoh, Hidemi [3]
Itoh, Hiroto [4]
Nagatomi, Takaharu [3]
Okamoto, Masayuki [5]
Ueno, Tomikazu [2]
Affiliations:
[1] Seikei Univ, Fac Sci & Technol, Musashino, Tokyo 1808633, Japan
[2] JSR Corp, Yokaichi, Mie 5108552, Japan
[3] Asahi Kasei Corp, Platform Lab Sci & Technol, Fuji, Shizuoka 4168501, Japan
[4] Kon Minolta Inc, Data Generat Div, Data Sci Ctr, Mat Sci Grp, Tokyo 1007015, Japan
[5] Kao Corp, Analyt Sci Res Lab, Wakayama, Wakayama 6408580, Japan
Keywords:
SIMS;
peptide identification;
amino acid sequence;
Random Forest;
ION MASS-SPECTROMETRY;
ADSORBED PROTEIN FILMS;
DOI:
10.1021/jasms.4c00310
Chinese Library Classification number:
Q5 [Biochemistry];
Discipline classification codes:
071010 ;
081704 ;
Abstract:
Interpreting time-of-flight secondary ion mass spectrometry (ToF-SIMS) data for organic materials is complicated because each molecule produces various fragment ions and certain mass peaks from different molecules overlap. Fragmentation mechanisms in SIMS are complex because different sputtering and ionization processes can occur simultaneously. Therefore, a prediction system that can identify materials in a sample is required. A novel prediction system for peptides based on ToF-SIMS and amino-acid-based teaching information (labels) for supervised machine learning was developed. To develop a prediction system for general organic materials, annotating the materials is crucial for creating effective labels for supervised learning. Peptides are composed of 20 kinds of amino acid residues, which can be used as labels. We previously developed a peptide prediction system using Random Forest, a supervised machine-learning method. However, only the amino acids contained in the target peptide were predicted, and the amino acid sequence could not be inferred. In this study, the amino acid sequence of the test peptide was determined by adding information on two adjacent amino acids to the labels. Once the prediction system learned the target peptide spectra, the peptides in newly obtained ToF-SIMS spectra could be identified. The new prediction system also provides useful information for the identification of unknown peptides. The prediction results indicate that permutations of two adjacent amino acids are effective teaching information for expressing the amino acid sequence of a peptide.
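The labeling idea in the abstract (single amino acids plus ordered pairs of adjacent amino acids) can be illustrated with a minimal sketch. The abstract does not give the paper's actual label encoding, so everything here is an assumption made for illustration: the names `AMINO_ACIDS`, `LABELS`, `peptide_labels`, and `active_labels` are hypothetical, and labels are encoded as presence/absence rather than counts.

```python
from itertools import product

# The 20 standard residues as one-letter codes (assumed label alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical label vocabulary: 20 single-residue labels followed by
# 400 ordered-pair labels, so "GA" and "AG" are distinct labels.
LABELS = list(AMINO_ACIDS) + [a + b for a, b in product(AMINO_ACIDS, repeat=2)]
LABEL_INDEX = {label: i for i, label in enumerate(LABELS)}


def peptide_labels(sequence):
    """Return a binary multi-label vector (list of 0/1) for one peptide."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    vector = [0] * len(LABELS)
    # Composition labels: which amino acids occur in the peptide.
    for residue in sequence:
        vector[LABEL_INDEX[residue]] = 1
    # Adjacency labels: which ordered pairs of neighbors occur.
    for left, right in zip(sequence, sequence[1:]):
        vector[LABEL_INDEX[left + right]] = 1
    return vector


def active_labels(vector):
    """List the label names that are switched on in a label vector."""
    return [LABELS[i] for i, value in enumerate(vector) if value]


if __name__ == "__main__":
    # Same composition, different order: only the pair labels differ.
    print(active_labels(peptide_labels("GAV")))
    print(active_labels(peptide_labels("VAG")))
```

Under these assumptions, two peptides with the same composition but different sequences share the 20 composition labels and differ only in the pair labels, which is what lets the labels carry sequence information. Vectors like these could serve as targets for a multi-label classifier such as Random Forest, with the ToF-SIMS peak intensities as features.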
Pages: 3057-3062
Number of pages: 6