Learning from the Harvard Clean Energy Project: The Use of Neural Networks to Accelerate Materials Discovery

Cited by: 172
Authors
Pyzer-Knapp, Edward O. [1]
Li, Kewei [1]
Aspuru-Guzik, Alan [1]
Affiliations
[1] Dept Chem & Chem Biol, Cambridge, MA 02138 USA
Funding
National Science Foundation (USA);
Keywords
ORGANIC PHOTOVOLTAICS; AQUEOUS SOLUBILITY; QUANTUM-CHEMISTRY; PREDICTION; DESIGN; APPROXIMATION;
DOI
10.1002/adfm.201501919
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Here, multilayer perceptrons, a type of artificial neural network, are proposed as part of a computational funneling procedure for high-throughput organic materials design. Using state-of-the-art algorithms and a large amount of data extracted from the Harvard Clean Energy Project, it is demonstrated that these methods greatly reduce the fraction of the screening library that must actually be calculated. Neural networks can reproduce the results of quantum-chemical calculations with a high level of accuracy. The proposed approach allows large-scale molecular screening projects to be carried out with less computational time. This, in turn, allows for the exploration of increasingly large and diverse libraries.
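The funneling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the network size, the plain SGD training loop, the numeric descriptor vectors, and the names `train_mlp`, `funnel`, and `expensive_calc` are all assumptions made for illustration. A small multilayer perceptron is trained on a subset of the library for which the expensive calculation has been run, the remaining candidates are ranked by the cheap prediction, and only the top fraction is passed on to the expensive calculation.

```python
import math
import random


def train_mlp(X, y, hidden=8, epochs=200, lr=0.01, seed=0):
    """Train a one-hidden-layer perceptron (tanh units) with plain SGD.

    Hypothetical stand-in for the paper's networks; returns a predictor.
    """
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        # Hidden activations, then a linear output unit.
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(W1, b1)]
        return h, sum(w * hi for w, hi in zip(W2, h)) + b2

    for _ in range(epochs):
        # Visit the training points in a fresh random order each epoch.
        for i in rng.sample(range(len(X)), len(X)):
            x, target = X[i], y[i]
            h, out = forward(x)
            err = out - target  # gradient of 0.5 * (out - target)^2
            for j in range(hidden):
                grad_h = err * W2[j] * (1.0 - h[j] ** 2)
                W2[j] -= lr * err * h[j]
                for k in range(n_in):
                    W1[j][k] -= lr * grad_h * x[k]
                b1[j] -= lr * grad_h
            b2 -= lr * err

    return lambda x: forward(x)[1]


def funnel(library, expensive_calc, n_train, keep_fraction, seed=0):
    """Run the expensive calculation only on a training subset plus the
    top-ranked fraction of the remaining library.

    Returns (results for the shortlist, fraction of library calculated).
    """
    rng = random.Random(seed)
    train_idx = set(rng.sample(range(len(library)), n_train))
    X = [library[i] for i in sorted(train_idx)]
    y = [expensive_calc(x) for x in X]
    predict = train_mlp(X, y, seed=seed)

    # Rank the unseen candidates by the cheap surrogate prediction.
    rest = [i for i in range(len(library)) if i not in train_idx]
    rest.sort(key=lambda i: predict(library[i]), reverse=True)
    n_keep = round(len(rest) * keep_fraction)

    results = {i: expensive_calc(library[i]) for i in rest[:n_keep]}
    return results, (n_train + n_keep) / len(library)
```

In this sketch `expensive_calc` stands in for the quantum-chemical calculation and each library entry is a list of numbers; in practice the inputs would be molecular descriptors, and a real project would likely use an established neural network library rather than a hand-written training loop.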
Pages: 6495-6502
Page count: 8