LBSizeCleav: improved support vector machine (SVM)-based prediction of Dicer cleavage sites using loop/bulge length

Cited by: 13
Authors
Bao, Yu [1]
Hayashida, Morihiro [1]
Akutsu, Tatsuya [1]
Affiliations
[1] Kyoto Univ, Inst Chem Res, Bioinformat Ctr, Lab Math Bioinformat, Kyoto 6110011, Japan
Keywords
Dicer cleavage site; Support vector machine; Loop/bulge length; RNA; CLASSIFICATION; PROTEIN; SVM; MICRORNAS
DOI
10.1186/s12859-016-1353-6
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Dicer is necessary for the formation of mature microRNA (miRNA) because the enzyme must cleave pre-miRNA at the correct position to generate miRNA with the correct seed region. Nonetheless, the mechanism underlying the selection of a Dicer cleavage site is still not fully understood. Several studies have addressed this problem; for example, a recent finding indicates that the loop/bulge structure plays a central role in the selection of Dicer cleavage sites. Building on this finding, a support vector machine (SVM)-based method called PHDCleav was developed to predict Dicer cleavage sites, and it outperforms other methods based on random forest and naive Bayes. PHDCleav, however, tests only whether a position in the shift window belongs to a loop/bulge structure.
Results: In this paper, we used the length of loop/bulge structures (in addition to their presence or absence) to develop an improved method, LBSizeCleav, for predicting Dicer cleavage sites. To evaluate our method, we used 810 empirically validated sequences of human pre-miRNAs and performed fivefold cross-validation. In both the 5p and 3p arms of pre-miRNAs, LBSizeCleav showed greater prediction accuracy than PHDCleav. This result suggests that the length of loop/bulge structures is useful for predicting Dicer cleavage sites.
Conclusion: We developed a novel algorithm for feature space mapping based on the length of a loop/bulge for predicting Dicer cleavage sites. The better performance of our method indicates that the length of loop/bulge structures is useful for such predictions.
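The difference between a presence/absence encoding and a length-aware encoding of loop/bulge structure can be illustrated with a minimal sketch. This is not the authors' feature mapping: the dot-bracket input, the function names `unpaired_run_lengths` and `window_features`, the zero padding outside the sequence, and the use of the raw run length as the feature value are all assumptions made for illustration only.

```python
from typing import List


def unpaired_run_lengths(structure: str) -> List[int]:
    """For each position of a dot-bracket string, return the length of the
    contiguous unpaired run ('.') that contains it, or 0 if the base is paired.
    (Hypothetical helper; treats every unpaired run as a loop/bulge.)"""
    lengths = [0] * len(structure)
    i = 0
    while i < len(structure):
        if structure[i] == ".":
            j = i
            while j < len(structure) and structure[j] == ".":
                j += 1
            # Every position in the run [i, j) gets the run length j - i.
            for k in range(i, j):
                lengths[k] = j - i
            i = j
        else:
            i += 1
    return lengths


def window_features(structure: str, center: int, half_width: int,
                    use_length: bool = True) -> List[int]:
    """Encode a window around `center` (0-based).
    use_length=False: 1/0 for unpaired/paired (presence or absence only).
    use_length=True:  length of the unpaired run at each position.
    Positions outside the sequence are padded with 0 (an assumption)."""
    lengths = unpaired_run_lengths(structure)
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        value = lengths[pos] if 0 <= pos < len(structure) else 0
        features.append(value if use_length else int(value > 0))
    return features


structure = "((..(((...)))..))"  # toy structure, not a real pre-miRNA
print(window_features(structure, center=5, half_width=3, use_length=False))
print(window_features(structure, center=5, half_width=3, use_length=True))
```

In this toy window the binary encoding cannot distinguish the two-nucleotide bulge from the three-nucleotide loop, whereas the length-aware encoding can. Vectors of this kind could then be passed to any SVM implementation and evaluated with fivefold cross-validation.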
Pages: 11