Towards a piRNA prediction using multiple kernel fusion and support vector machine

Cited by: 38
Authors
Brayet, Jocelyn [1, 2]
Zehraoui, Farida [1]
Jeanson-Leh, Laurence [2]
Israeli, David [2]
Tahi, Fariza [1]
Affiliations
[1] IBGBI, UEVE Genopole, IBISC EA 4526, F-91000 Evry, France
[2] Genethon, F-91002 Evry, France
Keywords
SMALL RNAS; PIWI; PROTEIN; CLASSIFICATION; GERMLINE; MILI
DOI
10.1093/bioinformatics/btu441
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Piwi-interacting RNA (piRNA) is the most recently discovered and the least investigated class of Argonaute/Piwi protein-interacting small non-coding RNAs. The piRNAs are mostly known to be involved in protecting the genome from invasive transposable elements, but recent discoveries suggest that they are also involved in the pathophysiology of diseases such as cancer. Their identification is therefore an important task, and computational methods are needed. However, the lack of conserved piRNA sequences and structural elements makes this identification challenging. Results: In the present study, we propose a new modular and extensible machine learning method for piRNA identification, based on multiple kernels and a support vector machine (SVM) classifier. Very few piRNA features are known to date. A multiple-kernel approach makes it possible to edit, add or remove heterogeneous piRNA features in a modular manner, according to their relevance in a given species. Our algorithm combines previously identified features [sequence features (k-mer motifs and a uridine at the first position) and a piRNA cluster feature] with a new telomere/centromere vicinity feature. These features are heterogeneous, and the kernels provide a unified representation for them. The proposed algorithm, named piRPred, gives promising results on Drosophila and human data and outperforms previously published piRNA identification algorithms.
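The idea of unifying heterogeneous features through kernels can be illustrated with a minimal sketch. It is not the piRPred implementation: the function names, the choice of k = 2, the cosine normalization and the equal weights are all assumptions made for illustration. Only two of the sequence features named in the abstract (k-mer motifs and a uridine at the first position) are covered. Each feature gets its own kernel, and the kernels are fused as a non-negative weighted sum, which is one common way to combine kernels.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=2):
    """Count overlapping k-mers in an RNA sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=2):
    """Cosine-normalized k-mer spectrum kernel (illustrative choice)."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    dot = sum(ca[m] * cb[m] for m in ca)
    norm_a = sqrt(sum(v * v for v in ca.values()))
    norm_b = sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def first_u_kernel(a, b):
    """Linear kernel on one binary feature: uridine at the first position."""
    return float(a[:1] == "U") * float(b[:1] == "U")


def fused_gram(seqs, weights=(0.5, 0.5)):
    """Gram matrix of the weighted sum of the two kernels.

    The weights are hypothetical; a non-negative weighted sum of valid
    kernels is itself a valid kernel.
    """
    w_kmer, w_u = weights
    return [
        [w_kmer * spectrum_kernel(a, b) + w_u * first_u_kernel(a, b) for b in seqs]
        for a in seqs
    ]


if __name__ == "__main__":
    # Toy sequences, made up for illustration only.
    toy = ["UGACU", "UGACU", "AGGCA"]
    for row in fused_gram(toy):
        print(row)
```

A fused Gram matrix of this kind could then be passed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. Adding or removing a feature amounts to adding or removing one term in the sum, which is the modularity the abstract refers to.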
Pages: i364-i370
Page count: 7