A Deep Neural Network for Predicting and Engineering Alternative Polyadenylation

Cited by: 137
Authors
Bogard, Nicholas [1]
Linder, Johannes [2]
Rosenberg, Alexander B. [1]
Seelig, Georg [1,2]
Affiliations
[1] Univ Washington, Dept Elect & Comp Engn, Seattle, WA 98195 USA
[2] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
RNA-BINDING PROTEINS; REGULATORY ELEMENTS; MOLECULAR-MECHANISMS; SYNTHETIC BIOLOGY; SIGNAL; REVEALS; GENE; IDENTIFICATION; RECOGNITION; CLEAVAGE
DOI
10.1016/j.cell.2019.04.046
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Alternative polyadenylation (APA) is a major driver of transcriptome diversity in human cells. Here, we use deep learning to predict APA from DNA sequence alone. We trained our model (APARENT, APA REgression NeT) on isoform expression data from over 3 million APA reporters. APARENT's predictions are highly accurate when tasked with inferring APA in synthetic and human 3'UTRs. Visualizing features learned across all network layers reveals that APARENT recognizes sequence motifs known to recruit APA regulators, discovers previously unknown sequence determinants of 3' end processing, and integrates these features into a comprehensive, interpretable, cis-regulatory code. We apply APARENT to forward engineer functional polyadenylation signals with precisely defined cleavage position and isoform usage and validate predictions experimentally. Finally, we use APARENT to quantify the impact of genetic variants on APA. Our approach detects pathogenic variants in a wide range of disease contexts, expanding our understanding of the genetic origins of disease.
Pages: 91 / +
Page count: 39