DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers

Cited by: 133
Authors
de Almeida, Bernardo P. [1, 2, 3]
Reiter, Franziska [1, 2, 3]
Pagani, Michaela [1]
Stark, Alexander [1, 4]
Affiliations
[1] Vienna BioCtr, Res Inst Mol Pathol, Campus Vienna BioCtr 1, Vienna, Austria
[2] Univ Vienna, Vienna BioCtr PhD Program, Doctoral Sch, Vienna, Austria
[3] Med Univ Vienna, Vienna, Austria
[4] Med Univ Vienna, Vienna BioCtr, Vienna, Austria
Funding
European Research Council; Austrian Science Fund;
Keywords
TRANSCRIPTION FACTOR-BINDING; GENE-EXPRESSION; SYSTEMATIC DISSECTION; GENOME; SPECIFICITY; FEATURES; SITES; MECHANISMS; REPRESSION; THOUSANDS;
DOI
10.1038/s41588-022-01048-5
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Enhancer sequences control gene expression and comprise binding sites (motifs) for different transcription factors (TFs). Despite extensive genetic and computational studies, the relationship between DNA sequence and regulatory activity is poorly understood, and de novo enhancer design has been challenging. Here, we built a deep-learning model, DeepSTARR, to quantitatively predict the activities of thousands of developmental and housekeeping enhancers directly from DNA sequence in Drosophila melanogaster S2 cells. The model learned relevant TF motifs and higher-order syntax rules, including functionally nonequivalent instances of the same TF motif that are determined by motif-flanking sequence and intermotif distances. We validated these rules experimentally and demonstrated that they can be generalized to humans by testing more than 40,000 wild-type and mutant Drosophila and human enhancers. Finally, we designed and functionally validated synthetic enhancers with desired activities de novo.
A deep-learning model called DeepSTARR quantitatively predicts enhancer activity on the basis of DNA sequence. The model learns relevant motifs and syntax rules, allowing for the design of synthetic enhancers with specific strengths.
Pages: 613 / +
Page count: 18