Mechanistic Inference from Statistical Models at Different Data-Size Regimes

Cited by: 19
Authors
Lustosa, Danilo M. [1]
Milo, Anat [1]
Affiliations
[1] Ben Gurion Univ Negev, Dept Chem, IL-84105 Beer Sheva, Israel
Funding
Israel Science Foundation;
Keywords
statistics; mechanism; molecular descriptors; data set design; data visualization; cheminformatics; machine learning; FREE-ENERGY RELATIONSHIPS; AIDED SYNTHESIS DESIGN; RATE-DETERMINING STEP; N-SUBSTITUENT SIZES; C-H ACTIVATION; THROUGHPUT EXPERIMENTATION; ORGANOMETALLIC CHEMISTRY; ALKALINE-HYDROLYSIS; CHELATING P; P-DONOR; CONJUGATE ADDITION;
DOI
10.1021/acscatal.2c01741
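Chinese Library Classification number
O64 [Physical chemistry (theoretical chemistry), chemical physics];
Discipline classification code
070304; 081704;
Abstract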
The chemical sciences are witnessing an influx of statistics into the catalysis literature. These developments are propelled by modern technological advancements that are leading to fast and reliable data production, mining, and management. In organic chemistry, models encoded with information-rich parameters have facilitated the formulation of mechanistic hypotheses across different data-size regimes. Herein, we aim to demonstrate through selected examples that the integration of statistical principles into homogeneous catalysis can streamline not only reaction optimization protocols but also mechanistic investigation procedures. Namely, we highlight how different aspects of molecular modeling, data set design, data visualization, and nuanced data restructuring can contribute to improving chemical reactivity and selectivity, while furthering our understanding of reaction mechanisms. By mapping out these techniques at different data set sizes, we hope to encourage the broad application of data-driven approaches for mechanistic studies regardless of the accessible amount of data.
Pages: 7886-7906
Page count: 21