LncRNAnet: long non-coding RNA identification using deep learning

Times cited: 70
Authors
Baek, Junghwan [1]
Lee, Byunghan [2]
Kwon, Sunyoung [2]
Yoon, Sungroh [1,2]
Affiliations
[1] Seoul Natl Univ, Interdisciplinary Program Bioinformat, Seoul 08826, South Korea
[2] Seoul Natl Univ, Elect & Comp Engn, Seoul 08826, South Korea
Funding
National Research Foundation of Singapore;
Keywords
GENE; EVOLUTION; ANNOTATION; EXPRESSION; ALIGNMENT; FEATURES; GENCODE;
DOI
10.1093/bioinformatics/bty418
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Motivation: Long non-coding RNAs (lncRNAs) are important regulatory elements in biological processes. LncRNAs share similar sequence characteristics with messenger RNAs, but they play completely different roles, thus providing novel insights for biological studies. The development of next-generation sequencing has helped in the discovery of lncRNA transcripts. However, the experimental verification of numerous transcriptomes is time-consuming and costly. To alleviate these issues, a computational approach is needed to distinguish lncRNAs from the transcriptomes. Results: We present a deep learning-based approach, lncRNAnet, to identify lncRNAs that incorporates recurrent neural networks for RNA sequence modeling and convolutional neural networks for detecting stop codons to obtain an open reading frame indicator. lncRNAnet performed clearly better than the other tools for sequences of short lengths, on which most lncRNAs are distributed. In addition, lncRNAnet successfully learned features and showed 7.83%, 5.76%, 5.30% and 3.78% improvements over the alternatives on a human test set in terms of specificity, accuracy, F1-score and area under the curve, respectively.
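To make the idea of an open reading frame indicator concrete, the following is a minimal Python sketch. It is not the method described in the abstract: the paper obtains its indicator from a convolutional network that detects stop codons, whereas this sketch swaps in a plain rule-based scan of the three forward reading frames. The function names `longest_orf` and `orf_indicator`, and the choice to mark only the longest ATG-to-stop span, are assumptions made for illustration.

```python
# Illustrative only: a rule-based stand-in for a learned ORF indicator.
# Names and design choices here are hypothetical, not taken from the paper.

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def longest_orf(seq):
    """Return (start, end) of the longest ATG...stop span found in the
    three forward reading frames, or None if no complete ORF exists.
    `end` is exclusive and includes the stop codon."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = None
    for frame in range(3):
        start = None
        # Step through the frame one codon at a time.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if best is None or (end - start) > (best[1] - best[0]):
                    best = (start, end)
                start = None  # look for the next ORF in this frame
    return best


def orf_indicator(seq):
    """Return a 0/1 list with 1 at every position inside the longest ORF."""
    indicator = [0] * len(seq)
    span = longest_orf(seq)
    if span is not None:
        for i in range(span[0], span[1]):
            indicator[i] = 1
    return indicator
```

A per-position vector of this kind could be supplied alongside the encoded nucleotides as an extra input channel to a sequence model; how lncRNAnet actually combines the two is specified in the paper itself, not here.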
Pages: 3889-3897
Page count: 9