Improved modeling of RNA-binding protein motifs in an interpretable neural model of RNA splicing

Cited by: 5
Authors
Gupta, Kavi [1]
Yang, Chenxi [2]
McCue, Kayla [3]
Bastani, Osbert [4]
Sharp, Phillip A. [3,5]
Burge, Christopher B. [3]
Solar-Lezama, Armando [1]
Affiliations
[1] MIT, Dept Elect Engn & Comp Sci, Cambridge, MA 02139 USA
[2] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
[3] MIT, Dept Biol, Cambridge, MA 02139 USA
[4] Univ Penn, Dept Comp & Informat Sci, Philadelphia, PA 19104 USA
[5] MIT, Koch Inst Integrat Canc Res, Cambridge, MA 02139 USA
Funding
National Science Foundation (USA);
Keywords
Alternative splicing; Genome interpretation; Machine learning; Neural network; RNA processing; RNA-binding protein; Variant interpretation; SEQUENCE; RECOGNITION; REPRESSION; PREDICTION; ACTIVATION; SR;
DOI
10.1186/s13059-023-03162-x
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005; 0836; 090102; 100705;
Abstract
Sequence-specific RNA-binding proteins (RBPs) play central roles in splicing decisions. Here, we describe a modular splicing architecture that leverages in vitro-derived RNA affinity models for 79 human RBPs and the annotated human genome to produce improved models of RBP binding and activity. Binding and activity are modeled by separate Motif and Aggregator components that can be mixed and matched, enforcing sparsity to improve interpretability. Training a new Adjusted Motif (AM) architecture on the splicing task not only yields better splicing predictions but also improves prediction of RBP-binding sites in vivo and of splicing activity, assessed using independent data.
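To make the Motif/Aggregator split concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the function names, the top-fraction sparsity rule, the logistic aggregator, and the toy "GU" weight matrix are all assumptions chosen for illustration. It only shows the general shape of the idea: motif scores are computed per position, most are zeroed to enforce sparsity, and a separate component combines what remains into a per-position prediction.

```python
import math

BASES = "ACGU"  # column order assumed for the toy weight matrices


def motif_scores(seq, pwm):
    """Slide a position weight matrix over seq; one summed score per window."""
    width = len(pwm)
    return [
        sum(pwm[i][BASES.index(base)] for i, base in enumerate(seq[start:start + width]))
        for start in range(len(seq) - width + 1)
    ]


def sparsify(scores, keep_fraction):
    """Keep only positive scores at or above the top-fraction threshold; zero the rest.

    This is a hypothetical stand-in for a sparsity constraint; ties at the
    threshold are all kept.
    """
    k = max(1, int(len(scores) * keep_fraction))
    threshold = sorted(scores, reverse=True)[k - 1]
    return [s if (s >= threshold and s > 0) else 0.0 for s in scores]


def aggregate(activations_by_motif, weights, bias):
    """Combine sparse motif activations into one probability per position.

    Assumes every motif has the same width, so all activation lists
    have the same length.
    """
    n = len(next(iter(activations_by_motif.values())))
    probs = []
    for pos in range(n):
        z = bias + sum(weights[name] * acts[pos]
                       for name, acts in activations_by_motif.items())
        probs.append(1.0 / (1.0 + math.exp(-z)))
    return probs


# Toy motif that prefers "GU" (rows = motif positions, columns = A, C, G, U).
toy_pwm = [[-1.0, -1.0, 1.0, -1.0],
           [-1.0, -1.0, -1.0, 1.0]]

seq = "GGUAAGUACG"
raw = motif_scores(seq, toy_pwm)
sparse = sparsify(raw, keep_fraction=0.25)
probs = aggregate({"toy_GU": sparse}, weights={"toy_GU": 1.5}, bias=-2.0)
```

Because the two stages communicate only through the sparse activation lists, either stage can be swapped for another implementation without touching the other, which is the "mix and match" property the abstract refers to.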
Pages: 23