iSS-CNN: Identifying splicing sites using convolution neural network

Cited by: 34
Authors
Tayara, Hilal [1]
Tahir, Muhammad [1,2]
Chong, Kil To [1,3]
Affiliations
[1] Chonbuk Natl Univ, Dept Elect & Informat Engn, Jeonju 54896, South Korea
[2] Abdul Wali Khan Univ, Dept Comp Sci, Mardan 23200, Pakistan
[3] Chonbuk Natl Univ, Adv Elect & Informat Res Ctr, Jeonju 54896, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Computational biology; Deep learning; RNA; Splicing; SEQUENCE-BASED PREDICTOR; PRE-MESSENGER-RNA; TRANSLATION INITIATION SITE; MEMBRANE-PROTEIN TYPES; PSEUDO TRINUCLEOTIDE; IDENTIFICATION; DNA; GENES; PSEKNC; FEATURES;
DOI
10.1016/j.chemolab.2019.03.002
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
RNA splicing is an important post-transcriptional modification in eukaryotic organisms that allows a single gene to code for different proteins with different biological functions. Accurate identification of RNA splicing site sequences is therefore important for both drug discovery and biomedical research. However, identifying splicing sites with laboratory techniques is very expensive, so an accurate computational model is needed. In this work, we introduce an efficient convolutional neural network (CNN) model, called iSS-CNN, for splicing site identification. Whereas previous methods relied on hand-crafted features, the proposed CNN model extracts the features of splicing sites automatically. iSS-CNN was evaluated on benchmark datasets and outperformed existing methods. Using a 5-fold cross-validation test, the iSS-CNN predictor achieved an accuracy of 96.66% on a dataset containing splicing donor sites (SDS) and 93.57% on a dataset containing splicing acceptor sites (SAS). A web server for the iSS-CNN tool is available at https://home.jbnu.ac.kr/NSCL/iss-cnn.htm.
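The abstract's core idea, that convolution filters can pick up sequence features without hand-crafted descriptors, can be illustrated with a minimal sketch. It is not the iSS-CNN architecture, which the abstract does not specify. The function names, the RNA alphabet ordering, the example sequence, and the hand-set `gu_kernel` are all assumptions made for illustration; in a trained CNN the kernel weights would be learned from data. A simple contiguous 5-fold index split is included to show the evaluation protocol named above.

```python
# Illustrative sketch only: not the iSS-CNN architecture or its trained weights.
NUCLEOTIDES = "ACGU"  # assumed alphabet order; DNA-style datasets would use T instead of U


def one_hot(seq):
    """Encode a sequence as a list of 4-element rows, one row per nucleotide."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq.upper()]


def conv1d(encoded, kernel):
    """Slide a (width x 4) kernel along the sequence, then apply ReLU (no padding)."""
    width = len(kernel)
    out = []
    for start in range(len(encoded) - width + 1):
        total = 0.0
        for k in range(width):
            for c in range(4):
                total += encoded[start + k][c] * kernel[k][c]
        out.append(max(0.0, total))
    return out


def k_fold_indices(n_samples, k=5):
    """Split sample indices into k contiguous (train, test) folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


# Hypothetical hand-set kernel that responds to the dinucleotide "GU".
gu_kernel = [
    [0.0, 0.0, 1.0, 0.0],  # G at kernel position 0
    [0.0, 0.0, 0.0, 1.0],  # U at kernel position 1
]

activations = conv1d(one_hot("CAGGUAAGU"), gu_kernel)
# Index of the first window with the strongest response.
best_position = max(range(len(activations)), key=activations.__getitem__)
folds = k_fold_indices(10, k=5)
```

Each value in `activations` scores one two-nucleotide window, so a full match to the kernel's pattern yields the highest response. A real model would stack many learned kernels, pooling, and a classifier on top, and would typically shuffle or stratify samples before splitting into folds.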
Pages: 63-69
Number of pages: 7