MiRPara: a SVM-based software tool for prediction of most probable microRNA coding regions in genome scale sequences

Times cited: 138
Authors
Wu, Yonggan [1 ,2 ,3 ]
Wei, Bo [1 ]
Liu, Haizhou [1 ]
Li, Tianxian [2 ]
Rayner, Simon [1 ]
Affiliations
[1] Chinese Acad Sci, Wuhan Inst Virol, Bioinformat Grp, State Key Lab Virol, Wuhan 430071, Hubei, Peoples R China
[2] Chinese Acad Sci, State Key Lab Virol, Wuhan Inst Virol, Wuhan 430071, Hubei, Peoples R China
[3] Texas Tech Univ, Dept Biol Sci, Lubbock, TX 79409 USA
Source
BMC BIOINFORMATICS | 2011 / Volume 12
Keywords
DECOMPOSITION METHODS; IDENTIFICATION; RECOGNITION; MIRBASE; TARGETS; SIRNAS; RISC;
DOI
10.1186/1471-2105-12-107
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: MicroRNAs are a family of ~22 nt small RNAs that can regulate gene expression at the post-transcriptional level. Identification of these molecules and their targets can aid understanding of regulatory processes. Recently, high-throughput sequencing (HTS) has become a common identification method, but the technique has two major limitations. Firstly, the method has low efficiency, with typically fewer than 1 in 10,000 sequences representing miRNA reads; secondly, the method preferentially targets highly expressed miRNAs. If sequences are available, computational methods can provide a screening step to investigate the value of an HTS study and aid interpretation of results. However, current methods can only predict miRNAs for short fragments and have usually been trained against small datasets that do not always reflect the diversity of these molecules.
Results: We have developed a software tool, miRPara, that predicts most probable mature miRNA coding regions from genome-scale sequences in a species-specific manner. We classified sequences from miRBase into animal, plant and overall categories and used a support vector machine to train three models based on an initial set of 77 parameters related to the physical properties of the pre-miRNA and its miRNAs. By applying parameter filtering we found that a subset of ~25 parameters produced higher prediction ability than the full set. Our software achieves an accuracy of up to 80% against experimentally verified mature miRNAs, making it one of the most accurate methods available.
Conclusions: miRPara is an effective tool for locating miRNA coding regions in genome sequences and can be used as a screening step prior to HTS experiments. It is available at http://www.whiov.ac.cn/bioinformatics/mirpara
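The parameter filtering step mentioned in the Results can be illustrated with a minimal sketch. The abstract does not state which filtering criterion miRPara uses, so this example assumes a simple Fisher-score ranking as a stand-in; the function names (`fisher_score`, `select_top_features`) and the toy feature values are hypothetical and are not taken from the paper or the miRPara software.

```python
from statistics import mean, pvariance


def fisher_score(pos, neg):
    """Squared difference of class means divided by the summed class variances.

    ASSUMPTION: Fisher score is used here only as an illustrative filter;
    the abstract does not say which criterion miRPara applies.
    """
    spread = pvariance(pos) + pvariance(neg)
    gap = (mean(pos) - mean(neg)) ** 2
    if spread == 0:
        # No within-class spread: perfectly separating if the means differ.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def select_top_features(positives, negatives, k):
    """Return the indices of the k highest-scoring features.

    positives / negatives are lists of equal-length feature vectors
    (e.g. real pre-miRNA hairpins vs. background hairpins).
    Ties are broken by the lower feature index.
    """
    n_features = len(positives[0])
    scored = []
    for j in range(n_features):
        pos_col = [row[j] for row in positives]
        neg_col = [row[j] for row in negatives]
        scored.append((fisher_score(pos_col, neg_col), j))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [j for _, j in scored[:k]]


# Hypothetical toy data: three features per candidate hairpin.
positives = [[0.9, 5.0, 1.0], [0.8, 4.0, 2.0], [1.0, 6.0, 3.0]]
negatives = [[0.1, 5.0, 2.0], [0.2, 4.5, 1.0], [0.0, 5.5, 3.0]]

kept = select_top_features(positives, negatives, 1)
print(kept)
```

In this toy data only the first feature separates the two classes, so it is the one retained. In a setting like the one the abstract describes, the retained subset would then be passed to the support vector machine in place of the full parameter set.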
Pages: 14