TargetM6A: Identifying N6-Methyladenosine Sites From RNA Sequences via Position-Specific Nucleotide Propensities and a Support Vector Machine

Citations: 71
Authors
Li, Guang-Qing [1]
Liu, Zi [1]
Shen, Hong-Bin [2]
Yu, Dong-Jun [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Xiaolingwei 200, Nanjing 210094, Jiangsu, Peoples R China
[2] Shanghai Jiao Tong Univ, Inst Image Proc & Pattern Recognit, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Incremental feature selection; N6-methyladenosine; position-specific nucleotide propensity; RNA methylation; support vector machine; MESSENGER-RNA; M(6)A RNA; BINDING-SITES; NUCLEAR-RNA; PREDICTION; PROTEIN; IDENTIFICATION; METHYLATION; CLASSIFIER; REVEALS
DOI
10.1109/TNB.2016.2599115
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
As one of the most ubiquitous post-transcriptional modifications of RNA, N6-methyladenosine (m6A) plays an essential role in many vital biological processes. Identifying m6A sites in RNAs is important for both basic biomedical research and practical drug development. In this study, we designed a computational method, called TargetM6A, to rapidly and accurately identify m6A sites solely from primary RNA sequences. Two new features, i.e., position-specific nucleotide/dinucleotide propensities (PSNP/PSDP), are introduced and combined with the traditional nucleotide composition (NC) feature to represent RNA sequences. The extracted features are then optimized into a more compact and discriminative feature subset by applying an incremental feature selection (IFS) procedure. Based on the optimized feature subset, we trained TargetM6A on the training dataset with a support vector machine (SVM) as the prediction engine. We compared TargetM6A with existing methods for predicting m6A sites by performing stringent jackknife tests and independent validation tests on benchmark datasets. The experimental results show that TargetM6A outperformed the existing methods and improved prediction performance, with MCC = 0.526 and AUC = 0.818. We also provide a user-friendly web server for TargetM6A, which is publicly accessible for academic use at http://csbio.njust.edu.cn/bioinf/TargetM6A.
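The abstract names the PSNP feature without defining it. The following is a minimal sketch of one plausible reading, assuming that the propensity of a nucleotide at a position is its frequency among positive (methylated) samples minus its frequency among negative samples. That definition, the function names, and the toy sequences are all illustrative assumptions and are not taken from the paper.

```python
from collections import Counter

NUCLEOTIDES = "ACGU"


def position_frequencies(seqs):
    """Per-position nucleotide frequencies for equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        freqs.append({n: counts[n] / len(seqs) for n in NUCLEOTIDES})
    return freqs


def psnp_matrix(positives, negatives):
    """Assumed PSNP: positive-set frequency minus negative-set frequency."""
    pos = position_frequencies(positives)
    neg = position_frequencies(negatives)
    return [{n: p[n] - q[n] for n in NUCLEOTIDES} for p, q in zip(pos, neg)]


def encode(seq, matrix):
    """Map each nucleotide of seq to its propensity at that position."""
    return [matrix[i][n] for i, n in enumerate(seq)]


# Toy sequences (hypothetical, not from any benchmark dataset).
positives = ["GAACU", "GGACA"]
negatives = ["UUAGC", "CAAUU"]

matrix = psnp_matrix(positives, negatives)
features = encode("GGACU", matrix)
print(features)
```

A vector like `features` could then be concatenated with composition features and passed to a classifier such as an SVM. In a real evaluation the propensity matrix would need to be estimated from training samples only, so that test sequences do not leak into the features.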
Pages: 674-682
Page count: 9