A convolution neural network-based computational model to identify the occurrence sites of various RNA modifications by fusing varied features

Cited by: 10

Authors:

Tahir, Muhammad ^{[1]}

Hayat, Maqsood ^{[1]}

Chong, Kil To ^{[2,3]}

Affiliations:

[1] Abdul Wali Khan Univ Mardan, Dept Comp Sci, Mardan 23200, KP, Pakistan

[2] Jeonbuk Natl Univ, Dept Elect & Informat Engn, Jeonju 54896, South Korea

[3] Jeonbuk Natl Univ, Adv Elect & Informat Res Ctr, Jeonju 54896, South Korea

Source:

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS | 2021 / Vol. 211

Funding:

National Research Foundation of Singapore;

Keywords:

Deep learning; RNA Modifications; k-Gram; Feature extraction; Convolution neural network; Data processing; SEQUENCE-BASED PREDICTOR; N-6-METHYLADENOSINE SITES; N6-METHYLADENOSINE SITES; METHYLATION; 5-METHYLCYTOSINE; PROTEINS;

DOI:

10.1016/j.chemolab.2021.104233

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

RNA modification occurs in both prokaryotic and eukaryotic genomes and is considered one of the major RNA properties. RNA modifications are a major part of the regulatory landscape of genes and are involved in several bioprocesses at the post-transcriptional level. Therefore, identifying the residues at which RNA modifications occur is essential for determining their molecular functions and the relevant mechanisms. Although wet-lab experiments for identifying RNA modification sites have produced satisfactory results, these experimental approaches are highly labor-intensive and expensive. It is therefore indispensable to establish a novel and robust computational approach for predicting RNA modification sites. To address these issues, an intelligent computational predictor called "iRNA-Mod-CNN", based on deep learning, is developed to identify RNA modification sites. First, the biological sequences are encoded using the one-hot encoding method. The encoded feature vector is then provided to a convolution neural network (CNN) model in order to discern the concealed information. Further, the k-Gram feature space is fused with the CNN feature space. The computational predictor "iRNA-Mod-CNN" showed significant improvement over existing methods, achieving accuracies of 99.56%, 92.39%, and 86.66% on the m1A, m6A, and m5C benchmark datasets, respectively.
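
The encoding steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the alphabet `ACGU`, the choice `k=2`, the all-zero handling of unknown symbols, and the helper names `one_hot_encode`, `kgram_frequencies`, and `fuse` are all assumptions made for illustration. The record also does not say how the two feature spaces are combined, so plain concatenation is used here as one plausible reading, and the CNN itself is left out.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; the record does not specify it


def one_hot_encode(seq):
    """Map each nucleotide to a 4-dimensional indicator vector."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        if base in index:  # assumption: unknown symbols (e.g. N) stay all-zero
            row[index[base]] = 1
        encoded.append(row)
    return encoded


def kgram_frequencies(seq, k=2):
    """Normalized counts of every length-k substring, in lexicographic order."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    windows = len(seq) - k + 1
    for i in range(windows):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
    total = max(windows, 1)  # avoid division by zero for very short inputs
    return [counts[kmer] / total for kmer in kmers]


def fuse(cnn_features, kgram_features):
    """Combine two feature vectors by concatenation (an assumed fusion rule)."""
    return list(cnn_features) + list(kgram_features)


if __name__ == "__main__":
    sequence = "ACGUACGU"  # hypothetical example sequence
    one_hot = one_hot_encode(sequence)      # 8 x 4 matrix, input to a CNN
    kgram = kgram_frequencies(sequence, 2)  # 16-dimensional vector
    # Stand-in for learned CNN features: the flattened one-hot matrix.
    placeholder_cnn = [v for row in one_hot for v in row]
    print(len(fuse(placeholder_cnn, kgram)))
```

In a real pipeline the `placeholder_cnn` vector would be replaced by the activations of a trained convolutional layer; the sketch only shows the shapes of the inputs and of the fused vector.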


Pages: 6
