Predicting liquid chromatographic retention times of peptides from the Drosophila melanogaster proteome by machine learning approaches

Cited by: 36

Authors:

Tian, Feifei [1]

Yang, Li [1]

Lv, Fenglin [1]

Zhou, Peng [2]

Affiliations:

[1] Chongqing Univ, Coll Bioengn, Chongqing 400044, Peoples R China

[2] Zhejiang Univ, Dept Chem, Hangzhou 310027, Peoples R China

Source:

ANALYTICA CHIMICA ACTA | 2009, Volume 644, Issue 1-2

Keywords:

Least-squares support vector machine; Random forest; Gaussian process; Peptide; Liquid chromatography; Quantitative structure-retention relationship; PARTIAL LEAST-SQUARES; ARTIFICIAL NEURAL-NETWORKS; ESCHERICHIA-COLI PROTEOME; SUPPORT VECTOR MACHINE; CARLO CROSS-VALIDATION; QUANTITATIVE PREDICTION; PROTEASE DIGESTION; GAUSSIAN-PROCESSES; REGRESSION-MODELS; MS;

DOI:

10.1016/j.aca.2009.04.010

Chinese Library Classification:

O65 [Analytical Chemistry];

Discipline classification codes:

070302 ; 081704 ;

Abstract:

Three machine learning algorithms, least-squares support vector machine (LSSVM), random forest (RF) and Gaussian process (GP), were used to model the quantitative structure-retention relationship (QSRR) for predicting and explaining the retention behavior of proteome-wide peptides in reversed-phase liquid chromatography. Peptides were parameterized using the CODESSA approach, and 145 descriptors were obtained for each peptide, covering diverse structural information such as constitutional, topological, geometrical and physicochemical properties. On that basis, the nonlinear LSSVM, RF and GP methods, as well as a sophisticated linear method, partial least-squares regression (PLS), were employed in QSRR model development. The stability and predictive power of the constructed models were confirmed by a series of systematic validations: internal cross-validation, an external test and Monte Carlo cross-validation. The results show that regression models developed using the nonlinear approaches (LSSVM, RF and GP) predict better than linear PLS models. Considering that the retention times used in this work were measured on different columns and thus have a relatively large uncertainty (reproducibility within 7%), the optimal statistics obtained from GP modeling are satisfactory, with coefficients of determination (R²) for the training set and test set of 0.894 and 0.866, respectively. © 2009 Elsevier B.V. All rights reserved.
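
The validation scheme named in the abstract (Monte Carlo cross-validation scored by the coefficient of determination) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the single `descriptor` variable stands in for the 145 CODESSA descriptors, and a one-variable least-squares line replaces the LSSVM, RF, GP and PLS models, none of which is reproduced. The function names `r_squared`, `fit_line` and `monte_carlo_cv` are hypothetical and do not come from the paper.

```python
import random
import statistics


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b * x (stand-in model only)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def monte_carlo_cv(x, y, n_rounds=100, test_fraction=0.3, seed=0):
    """Repeat random train/test splits; return the test-set R^2 of each round."""
    rng = random.Random(seed)
    indices = list(range(len(x)))
    n_test = max(2, int(len(x) * test_fraction))
    scores = []
    for _ in range(n_rounds):
        rng.shuffle(indices)  # a fresh random partition every round
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        a, b = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        y_test = [y[i] for i in test_idx]
        y_hat = [a + b * x[i] for i in test_idx]
        scores.append(r_squared(y_test, y_hat))
    return scores


# Synthetic data (assumption): one descriptor, linear trend plus Gaussian noise.
data_rng = random.Random(42)
descriptor = [data_rng.uniform(0.0, 10.0) for _ in range(200)]
retention = [5.0 + 3.0 * d + data_rng.gauss(0.0, 2.0) for d in descriptor]

scores = monte_carlo_cv(descriptor, retention)
print(f"mean test R^2 = {statistics.fmean(scores):.3f}, "
      f"std = {statistics.stdev(scores):.3f}")
```

The spread of the per-round scores, not only their mean, is what indicates model stability: a model whose test R² varies widely across random splits is sensitive to which peptides happen to fall in the training set.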


Pages: 10-16

Number of pages: 7

