DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers

Citations: 133
Authors
de Almeida, Bernardo P. [1, 2, 3]
Reiter, Franziska [1, 2, 3]
Pagani, Michaela [1]
Stark, Alexander [1, 4]
Affiliations
[1] Vienna BioCtr, Res Inst Mol Pathol, Campus Vienna BioCtr 1, Vienna, Austria
[2] Univ Vienna, Vienna BioCtr PhD Program, Doctoral Sch, Vienna, Austria
[3] Med Univ Vienna, Vienna, Austria
[4] Med Univ Vienna, Vienna BioCtr, Vienna, Austria
Funding
European Research Council; Austrian Science Fund;
Keywords
TRANSCRIPTION FACTOR-BINDING; GENE-EXPRESSION; SYSTEMATIC DISSECTION; GENOME; SPECIFICITY; FEATURES; SITES; MECHANISMS; REPRESSION; THOUSANDS;
DOI
10.1038/s41588-022-01048-5
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Enhancer sequences control gene expression and comprise binding sites (motifs) for different transcription factors (TFs). Despite extensive genetic and computational studies, the relationship between DNA sequence and regulatory activity is poorly understood, and de novo enhancer design has been challenging. Here, we built a deep-learning model, DeepSTARR, to quantitatively predict the activities of thousands of developmental and housekeeping enhancers directly from DNA sequence in Drosophila melanogaster S2 cells. The model learned relevant TF motifs and higher-order syntax rules, including functionally nonequivalent instances of the same TF motif that are determined by motif-flanking sequence and intermotif distances. We validated these rules experimentally and demonstrated that they can be generalized to humans by testing more than 40,000 wildtype and mutant Drosophila and human enhancers. Finally, we designed and functionally validated synthetic enhancers with desired activities de novo. A deep-learning model called DeepSTARR quantitatively predicts enhancer activity on the basis of DNA sequence. The model learns relevant motifs and syntax rules, allowing for the design of synthetic enhancers with specific strengths.
Pages: 613 / +
Page count: 18