Structure Based Machine Learning Prediction of Retention Times for LC Method Development of Pharmaceuticals

Cited by: 3
Authors
Fine, Jonathan [1]
Mann, Amanda K. Peterson [1]
Aggarwal, Pankaj [1]
Affiliations
[1] Merck & Co Inc, Analyt Res & Dev, MRL, Rahway, NJ 07065 USA
Keywords
Chromatography; Machine learning; Method development; QSRR; MODEL;
DOI
10.1007/s11095-023-03646-2
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Purpose: Significant resources are spent on developing robust liquid chromatography (LC) methods with optimum conditions for all projects in the pipeline. Although data-driven, computer-assisted modelling has been implemented to shorten method development timelines, these modelling approaches require project-specific screening data to model retention time (RT) as a function of method parameters. Sometimes method re-development is required, leading to additional investment and redundant laboratory work. Cheminformatics techniques have been successfully used to predict the RTs of metabolites and other component mixtures for similar use cases. Here we show that these techniques can be used to model structurally diverse molecules, and that predictions from these models, trained on multiple LC conditions, can be used for downstream data-driven modelling.
Methods: The Molecular Operating Environment (MOE) was used to calculate over 800 descriptors from the structures of the analytes. These descriptors were used to model the RT of the analytes under four chromatographic conditions. These models were then used to create data-driven models using LC-SIM.
Results: A structure-based Random Forest (RF) model outperformed other techniques in cross-validation studies and predicted the RTs of a randomized test set with a median percentage error of less than 4% for all LC conditions. RTs predicted by this structure-based model were used to fit a data-driven model that identifies optimum LC conditions without any additional experimental work.
Conclusions: These results show that small training sets yield pharmaceutically relevant models when structure-based and data-driven models are used in combination.
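The abstract reports model accuracy as a median percentage error below 4%. It does not define the metric, so the following is only a minimal sketch under one common reading: the median of |predicted - observed| / observed, in percent. The function name and the retention times are hypothetical and are not taken from the paper.

```python
from statistics import median


def median_percentage_error(observed, predicted):
    """Median of |predicted - observed| / observed, in percent.

    Assumes every observed retention time is positive (hypothetical helper,
    not the authors' implementation).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one retention time is required")
    errors = [abs(p - o) / o * 100.0 for o, p in zip(observed, predicted)]
    return median(errors)


# Made-up retention times in minutes, for illustration only.
observed_rt = [2.0, 4.0, 5.0, 8.0, 10.0]
predicted_rt = [2.1, 3.9, 5.0, 8.4, 9.0]

mpe = median_percentage_error(observed_rt, predicted_rt)
print(f"median percentage error: {mpe:.1f}%")
```

With these invented values the per-analyte errors are roughly 5%, 2.5%, 0%, 5% and 10%, so the median is about 5%. The median is less affected than the mean by a single badly predicted analyte.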
Pages: 365-374
Number of pages: 10
Related papers
27 in total
  • [1] In Silico Multifactorial Modeling for Streamlined Development and Optimization of Two-Dimensional Liquid Chromatography
    Ahmad, Imad A. Haidar
    Makey, Devin M.
    Wang, Heather
    Shchurik, Vladimir
    Singh, Andrew N.
    Stoll, Dwight R.
    Mangion, Ian
    Regalado, Erik L.
    [J]. ANALYTICAL CHEMISTRY, 2021, 93 (33) : 11532 - 11539
  • [2] In silico method development for the reversed-phase liquid chromatography separation of proteins using chaotropic mobile phase modifiers
    Ahmad, Imad A. Haidar
    Bennett, Raffeal
    Makey, Devin
    Shchurik, Vladimir
    Lhotka, Hayley
    Mann, Benjamin F.
    McClain, Ray
    Lu, Tian
    Hua, Xiaoqing
    Strulson, Christopher A.
    Loughney, John W.
    Mangion, Ian
    Makarov, Alexey A.
    Regalado, Erik L.
    [J]. JOURNAL OF CHROMATOGRAPHY B-ANALYTICAL TECHNOLOGIES IN THE BIOMEDICAL AND LIFE SCIENCES, 2021, 1173
  • [3] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32
  • [4] Consequences of sample size, variable selection, and model validation and optimisation, for predicting classification ability from analytical data
    Brereton, Richard G.
    [J]. TRAC-TRENDS IN ANALYTICAL CHEMISTRY, 2006, 25 (11) : 1103 - 1111
  • [5] The METLIN small molecule dataset for machine learning-based retention time prediction
    Domingo-Almenara, Xavier
    Guijas, Carlos
    Billings, Elizabeth
    Montenegro-Burke, J. Rafael
    Uritboonthai, Winnie
    Aisporna, Aries E.
    Chen, Emily
    Benton, H. Paul
    Siuzdak, Gary
    [J]. NATURE COMMUNICATIONS, 2019, 10 (1)
  • [6] Graph-based machine learning interprets and predicts diagnostic isomer-selective ion-molecule reactions in tandem mass spectrometry
    Fine, Jonathan
    Liu, Judy Kuan-Yu
    Beck, Armen
    Alzarieni, Kawthar Z.
    Ma, Xin
    Boulos, Victoria M.
    Kenttamaa, Hilkka I.
    Chopra, Gaurav
    [J]. CHEMICAL SCIENCE, 2020, 11 (43) : 11849 - 11858
  • [7] Perspective on the Future Approaches to Predict Retention in Liquid Chromatography
    Gritti, Fabrice
    [J]. ANALYTICAL CHEMISTRY, 2021, 93 (14) : 5653 - 5664
  • [8] Prediction of Analyte Retention Time in Liquid Chromatography
    Haddad, Paul R.
    Taraji, Maryam
    Szucs, Roman
    [J]. ANALYTICAL CHEMISTRY, 2021, 93 (01) : 228 - 256
  • [9] Haghighatlari M, 2019, DETERMINACION NIVEL, P1, DOI 10.26434/chemrxiv.8796947.v2
  • [10] Hartigan J. A., 1979, Applied Statistics, V28, P100, DOI 10.2307/2346830