Characterization and prediction of mRNA polyadenylation sites in human genes

Cited by: 14
Authors
Chang, Tzu-Hao [2 ]
Wu, Li-Ching [3 ]
Chen, Yu-Ting [1 ]
Huang, Hsien-Da [2 ,4 ]
Liu, Baw-Jhiune [5 ]
Cheng, Kuang-Fu [6 ,7 ]
Horng, Jorng-Tzong [1 ,3 ,8 ]
Affiliations
[1] Natl Cent Univ, Dept Comp Sci & Informat Engn, Jhongli, Taiwan
[2] Natl Chiao Tung Univ, Inst Bioinformat & Syst Biol, Hsinchu, Taiwan
[3] Natl Cent Univ, Inst Syst Biol & Bioinformat, Jhongli, Taiwan
[4] Natl Chiao Tung Univ, Dept Biol Sci & Technol, Hsinchu, Taiwan
[5] Yuan Ze Univ, Dept Comp Sci & Informat Engn, Jhongli, Taiwan
[6] China Med Univ, Ctr Biostat, Taichung, Taiwan
[7] Natl Cent Univ, Inst Stat, Jhongli, Taiwan
[8] Asia Univ, Dept Bioinformat, Taichung, Taiwan
Keywords
Bioinformatics; Data mining; Polyadenylation poly(A); Support vector machines (SVMs); ALTERNATIVE POLYADENYLATION; PROCESSING EFFICIENCY; SEQUENCE ELEMENTS; UPSTREAM; SIGNAL; SECONDARY; MECHANISM; CLEAVAGE; REGION; MOUSE;
DOI
10.1007/s11517-011-0732-4
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The accurate identification of potential poly(A) sites has contributed to many studies of alternative polyadenylation. The aim of this study was to develop a machine-learning methodology that helps discriminate real polyadenylation signals from randomly occurring signals in genomic sequence. Since previous studies have revealed that RNA secondary structure in certain genes has a significant impact on polyadenylation, the authors tried to computationally pinpoint common structural patterns around poly(A) sites and to investigate how RNA secondary structure may influence polyadenylation. An initial study of the impact of RNA structure, carried out with motif search tools, found that hairpin structures might be important. Thus, it was proposed that, in addition to the sequence pattern around poly(A) sites, there exists a widespread structural pattern that is also employed during human mRNA polyadenylation. In this study, the authors present a computational model that uses support vector machines to predict human poly(A) sites. The results show that this predictive model performs comparably to the current prediction tool. In addition, common structural patterns associated with polyadenylation were identified using several motif-finding programs, which provides new insight into the role that RNA secondary structure plays in polyadenylation.
Pages: 463-472
Page count: 10