Retrosynthetic Reaction Prediction Using Neural Sequence-to-Sequence Models

Cited by: 345
Authors
Liu, Bowen [1]
Ramsundar, Bharath [2]
Kawthekar, Prasad [2]
Shi, Jade [1]
Gomes, Joseph [1]
Quang Luu Nguyen [1]
Ho, Stephen [1]
Sloane, Jack [1]
Wender, Paul [1, 3]
Pande, Vijay [1, 2, 4]
Affiliations
[1] Stanford Univ, Dept Chem, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[3] Stanford Univ, Dept Chem & Syst Biol, Stanford, CA 94305 USA
[4] Stanford Univ, Dept Biol Struct, Stanford, CA 94305 USA
Funding
U.S. National Science Foundation;
Keywords
ORGANIC-CHEMISTRY; AUTOMATED DISCOVERY; SYNTHETIC ANALYSIS; CHEMICAL-REACTIONS; KNOWLEDGE-BASE; COMPUTER; DESIGN; LANGUAGE; SYSTEM; METHODOLOGY;
DOI
10.1021/acscentsci.7b00303
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
We describe a fully data-driven model that learns to perform a retrosynthetic reaction prediction task, which is treated as a sequence-to-sequence mapping problem. The end-to-end trained model has an encoder-decoder architecture that consists of two recurrent neural networks, which has previously shown great success in solving other sequence-to-sequence prediction tasks such as machine translation. The model is trained on 50,000 experimental reaction examples from the United States patent literature, which span 10 broad reaction types that are commonly used by medicinal chemists. We find that our model performs comparably with a rule-based expert system baseline model, and also overcomes certain limitations associated with rule-based expert systems and with any machine learning approach that contains a rule-based expert system component. Our model provides an important first step toward solving the challenging problem of computational retrosynthetic analysis.
Pages: 1103-1113
Number of pages: 11