Machine Learning Guides Peptide Nucleic Acid Flow Synthesis and Sequence Design

Cited by: 6
Authors
Li, Chengxi [1, 2, 3]
Zhang, Genwei [1 ]
Mohapatra, Somesh [4 ]
Callahan, Alex J. [1 ]
Loas, Andrei [1 ]
Gomez-Bombarelli, Rafael [4 ]
Pentelute, Bradley L. [1, 5, 6, 7]
Affiliations
[1] MIT, Dept Chem, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] Zhejiang Univ, Coll Chem & Biol Engn, 866 Yuhangtang Rd, Hangzhou 310030, Zhejiang, Peoples R China
[3] ZJU Hangzhou Global Sci & Technol Innovat Ctr, 733 Jianshe San Rd, Hangzhou 311200, Zhejiang, Peoples R China
[4] MIT, Dept Mat Sci & Engn, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[5] MIT, Koch Inst Integrat Canc Res, 500 Main St, Cambridge, MA 02142 USA
[6] MIT, Ctr Environm Hlth Sci, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[7] Broad Inst MIT & Harvard, 415 Main St, Cambridge, MA 02142 USA
Keywords
automated synthesis; drug design; machine learning; peptide nucleic acid; yield prediction; DISCOVERY; PREDICTION; STABILITY;
DOI
10.1002/advs.202201988
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Peptide nucleic acids (PNAs) are potential antisense therapies for genetic, acquired, and viral diseases. Efficiently selecting candidate PNA sequences for synthesis and evaluation from a genome containing hundreds to thousands of options can be challenging. To facilitate this process, this work leverages machine learning (ML) algorithms and automated synthesis technology to predict PNA synthesis efficiency and guide rational PNA sequence design. The training data are collected from individual fluorenylmethyloxycarbonyl (Fmoc) deprotection reactions performed on a fully automated PNA synthesizer. The optimized ML model achieves 93% prediction accuracy and a Pearson's r of 0.97. The predicted synthesis scores are validated to correlate with the experimental high-performance liquid chromatography (HPLC) crude purities (correlation coefficient R² = 0.95). Furthermore, the general applicability of ML is demonstrated by designing synthetically accessible antisense PNA sequences from 102,315 predicted candidates targeting exon 44 of the human dystrophin gene, SARS-CoV-2, HIV, as well as selected genes associated with cardiovascular diseases, type II diabetes, and various cancers. Collectively, ML provides an accurate prediction of PNA synthesis quality and serves as a useful computational tool for informing PNA sequence design.
Pages: 9