Probing machine learning models based on high throughput experimentation data for the discovery of asymmetric hydrogenation catalysts

Cited by: 9
Authors
Kalikadien, Adarsh V. [1]
Valsecchi, Cecile [2]
van Putten, Robbert [3]
Maes, Tor [3]
Muuronen, Mikko [3]
Dyubankova, Natalia [3]
Lefort, Laurent [3]
Pidko, Evgeny A. [1]
Affiliations
[1] Delft Univ Technol, Fac Appl Sci, Dept Chem Engn, Inorgan Syst Engn, Van der Maasweg 9, NL-2629 HZ Delft, Netherlands
[2] Janssen Cilag SpA, Discovery Prod Dev & Supply, Viale Fulvio Testi,280-6, I-20126 Milan, Italy
[3] Janssen Pharmaceut NV, Discovery Prod Dev & Supply, Turnhoutseweg 30, B-2340 Beerse, Belgium
Keywords
ORGANOMETALLIC CHEMISTRY; DESIGN; MECHANISM; LIGANDS; ENANTIOSELECTIVITY; PREDICTION; EVOLUTION; ENAMIDES; SAMBVCA; ORIGIN;
DOI
10.1039/d4sc03647f
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Enantioselective hydrogenation of olefins by Rh-based chiral catalysts has been extensively studied for more than 50 years. Naively, one would expect that everything about this transformation is known and that selecting a catalyst that induces the desired reactivity or selectivity is a trivial task. Nonetheless, ligand engineering or selection for any new prochiral olefin remains an empirical trial-and-error exercise. In this study, we investigated whether machine learning techniques could be used to accelerate the identification of the most efficient chiral ligand. For this purpose, we used high-throughput experimentation to build a large dataset consisting of results for Rh-catalyzed asymmetric olefin hydrogenation, specially designed for applications in machine learning. We showcased its alignment with existing literature while addressing observed discrepancies. Additionally, a computational framework for the automated and reproducible quantum-chemistry-based featurization of catalyst structures was created. Together with less computationally demanding representations, these descriptors were fed into our machine learning pipeline for both out-of-domain and in-domain prediction tasks of selectivity and reactivity. For out-of-domain purposes, our models provided limited efficacy. It was found that even the most expensive descriptors do not impart significant meaning to the model predictions. The in-domain application, while partly successful for predictions of conversion, emphasizes the need for evaluating the cost-benefit ratio of computationally intensive descriptors and for tailored descriptor design. Challenges persist in predicting enantioselectivity, calling for caution in interpreting results from small datasets. Our insights underscore the importance of dataset diversity with broad substrate inclusion and suggest that mechanistic considerations could improve the accuracy of statistical models.
High-throughput experimentation and computational chemistry were used to build machine learning models for Rh-catalyzed asymmetric olefin hydrogenation, identifying numerous factors affecting the accuracy of selectivity and reactivity predictions.
Pages: 13618-13630
Number of pages: 14