Tuning the molecular weight distribution from atom transfer radical polymerization using deep reinforcement learning

Cited by: 52
Authors
Li, Haichen [1, 2]
Collins, Christopher R. [1 ]
Ribelli, Thomas G. [1 ]
Matyjaszewski, Krzysztof [1 ]
Gordon, Geoffrey J. [2 ]
Kowalewski, Tomasz [1 ]
Yaron, David J. [1 ]
Affiliations
[1] Carnegie Mellon Univ, Dept Chem, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
Keywords
MONTE-CARLO-SIMULATION; SOLUTION ARGET ATRP; NEURAL-NETWORKS; BATCH PROCESSES; DYNAMIC OPTIMIZATION; ELECTRON-TRANSFER; BLOCK-COPOLYMERS; RATE CONSTANTS; STAR POLYMERS; POLYDISPERSITY;
DOI
10.1039/c7me00131b
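Chinese Library Classification number
O64 [Physical chemistry (theoretical chemistry), chemical physics];
Discipline classification code
070304; 081704;
Abstract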
We devise a novel technique to control the shape of polymer molecular weight distributions (MWDs) in atom transfer radical polymerization (ATRP). This technique makes use of recent advances in both simulation-based, model-free reinforcement learning (RL) and the numerical simulation of ATRP. A simulation of ATRP is built that allows an RL controller to add chemical reagents throughout the course of the reaction. The RL controller incorporates fully connected and convolutional neural network architectures and bases its decisions on the current state of the ATRP reaction. The initial, untrained controller produces final MWDs with large variability, allowing the RL algorithm to explore a large search space. When trained using an actor-critic algorithm, the RL controller is able to discover and optimize control policies that lead to a variety of target MWDs. The target MWDs include Gaussians of various widths and more diverse shapes such as bimodal distributions. The learned control policies are robust and transfer to similar but not identical ATRP reaction settings, even in the presence of simulated noise. We believe this work is a proof of concept for employing modern artificial intelligence techniques in the synthesis of new functional polymer materials.
Pages: 496-508
Page count: 13