Predicting reaction performance in C-N cross-coupling using machine learning

Cited by: 530
Authors
Ahneman, Derek T. [1]
Estrada, Jesus G. [1]
Lin, Shishi [2]
Dreher, Spencer D. [2]
Doyle, Abigail G. [1]
Affiliations
[1] Princeton Univ, Dept Chem, Princeton, NJ 08544 USA
[2] Merck Sharp & Dohme Corp, Chem Capabil & Screening, Kenilworth, NJ 07033 USA
Keywords
DISCOVERY; TOOL
DOI
10.1126/science.aar5169
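Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [Natural sciences, general]
Discipline classification codes
07; 0710; 09
Abstract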
Machine learning methods are becoming integral to scientific inquiry in numerous disciplines. We demonstrated that machine learning can be used to predict the performance of a synthetic reaction in multidimensional chemical space using data obtained via high-throughput experimentation. We created scripts to compute and extract atomic, molecular, and vibrational descriptors for the components of a palladium-catalyzed Buchwald-Hartwig cross-coupling of aryl halides with 4-methylaniline in the presence of various potentially inhibitory additives. Using these descriptors as inputs and reaction yield as output, we showed that a random forest algorithm provides significantly improved predictive performance over linear regression analysis. The random forest model was also successfully applied to sparse training sets and out-of-sample prediction, suggesting its value in facilitating adoption of synthetic methodology.
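The comparison described in the abstract (descriptors as inputs, yield as output, random forest versus linear regression) can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available and uses purely synthetic data: the descriptor matrix, the threshold-style yield function, and all parameter values below are hypothetical stand-ins, not the authors' dataset, descriptors, or scripts.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a descriptor matrix:
# 600 hypothetical "reactions" x 5 hypothetical "descriptors".
X = rng.uniform(-1.0, 1.0, size=(600, 5))

# Hypothetical yield (%): high only when two descriptors are both positive.
# This interaction is deliberately nonlinear (an assumption for illustration).
noise = rng.normal(0.0, 3.0, size=600)
y = np.clip(10.0 + 70.0 * ((X[:, 0] > 0) & (X[:, 1] > 0)) + noise, 0.0, 100.0)

# Hold out 30% of the synthetic reactions for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit both models on the same training split.
linear = LinearRegression().fit(X_train, y_train)
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

# Compare held-out R^2 for the two models.
lin_r2 = r2_score(y_test, linear.predict(X_test))
rf_r2 = r2_score(y_test, forest.predict(X_test))

print(f"linear regression R^2: {lin_r2:.3f}")
print(f"random forest R^2:     {rf_r2:.3f}")
```

On data like this, where yield depends on an interaction between descriptors, a tree ensemble can capture structure that a single linear fit cannot. How large the gap is on real reaction data depends on the descriptors and the chemistry, so the numbers printed here say nothing about the paper's reported results.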
Pages: 186-190
Page count: 5