Discriminating between HuR and TTP binding sites using the k-spectrum kernel method

Cited by: 12
Authors
Bhandare, Shweta [1]
Goldberg, Debra S. [1,2,5]
Dowell, Robin [1,2,3,4]
Affiliations
[1] Univ Colorado, Dept Comp Sci, 1111 Engn Dr, Boulder, CO 80303 USA
[2] Univ Colorado, Sch Med, Computat Biosci Program, 12801 E 17th Ave,RC1N-6129, Aurora, CO 80045 USA
[3] Univ Colorado, Dept Mol Cellular & Dev Biol, 596 UCB, Boulder, CO 80309 USA
[4] Univ Colorado, BioFrontiers Inst, 596 UCB, Boulder, CO 80309 USA
[5] Cage Free Learning, Boulder, CO 80309 USA
Funding
U.S. National Science Foundation;
Keywords
MESSENGER-RNA; SEQUENCE MOTIFS; DNA-SEQUENCE; IDENTIFICATION; PREDICTION; DATABASE;
DOI
10.1371/journal.pone.0174052
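Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification code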
07 ; 0710 ; 09 ;
Abstract
Background: The RNA binding proteins (RBPs) human antigen R (HuR) and Tristetraprolin (TTP) are known to bind competitively but have opposing effects on the bound messenger RNA (mRNA). How cells discriminate between the two proteins is an open problem. Machine learning approaches, such as support vector machines (SVMs), may be useful for identifying discriminative features. However, this method has yet to be applied to studies of RNA binding protein motifs.

Results: Applying the k-spectrum kernel to a support vector machine (SVM), we first verified the published binding sites of both HuR and TTP. Additional feature engineering highlighted the U-rich binding preference of HuR and the AU-rich binding preference of TTP. Domain adaptation along with multi-task learning was used to predict the common binding sites.

Conclusion: The distinction between HuR and TTP binding appears to lie in subtle content features. HuR prefers strongly U-rich sequences, whereas TTP prefers AU-rich sequences: as A content increases, sequences are more likely to be bound only by TTP. Our model is consistent with competitive binding of the two proteins, particularly at intermediate, AU-balanced sequences. This suggests that fine changes in the A/U balance within an untranslated region (UTR) can alter the binding and subsequent stability of the message. Both feature engineering and domain adaptation emphasized the extent to which these proteins recognize similar general sequence features. This work suggests that the k-spectrum kernel method could be useful when studying RNA binding proteins, and that domain adaptation techniques such as feature augmentation could be employed, particularly when examining RBPs with similar binding preferences.
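To make the two techniques named in the abstract concrete, here is a minimal Python sketch, not the authors' implementation. It assumes the usual definition of the k-spectrum kernel (the inner product of overlapping k-mer count vectors) and the common "shared copy plus domain-specific copy" form of feature augmentation. The function names, the choice k=3, the example sequences, and the domain labels "HuR" and "TTP" are illustrative assumptions.

```python
from collections import Counter
from math import sqrt


def spectrum(seq, k):
    """k-spectrum feature map: counts of every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def dot(u, v):
    """Inner product of two sparse feature vectors stored as mappings."""
    return sum(count * v.get(feature, 0) for feature, count in u.items())


def spectrum_kernel(a, b, k=3):
    """k-spectrum kernel value for two sequences (k=3 is an arbitrary choice)."""
    return dot(spectrum(a, k), spectrum(b, k))


def normalized_kernel(a, b, k=3):
    """Cosine-normalized kernel, so sequence length does not dominate."""
    denom = sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0


def augment(features, domain):
    """Feature augmentation: keep one shared copy of each feature and one
    copy tagged with the domain (here, hypothetically, the protein name)."""
    out = {("shared", f): v for f, v in features.items()}
    out.update({(domain, f): v for f, v in features.items()})
    return out


if __name__ == "__main__":
    u_rich, au_rich = "UUUUU", "AUUUA"  # made-up toy sequences
    print(spectrum_kernel(u_rich, au_rich))
    print(normalized_kernel(u_rich, au_rich))
```

With augmentation, two examples from different domains share only the "shared" block, so their inner product equals the plain kernel value, while two examples from the same domain count every feature twice. A precomputed kernel matrix built this way could then be passed to any SVM implementation that accepts one.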
Pages: 14