LncRNAnet: long non-coding RNA identification using deep learning

Cited by: 69
Authors
Baek, Junghwan [1]
Lee, Byunghan [2]
Kwon, Sunyoung [2]
Yoon, Sungroh [1, 2]
Affiliations
[1] Seoul Natl Univ, Interdisciplinary Program Bioinformat, Seoul 08826, South Korea
[2] Seoul Natl Univ, Elect & Comp Engn, Seoul 08826, South Korea
Funding
National Research Foundation of Singapore;
Keywords
GENE; EVOLUTION; ANNOTATION; EXPRESSION; ALIGNMENT; FEATURES; GENCODE;
DOI
10.1093/bioinformatics/bty418
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Long non-coding RNAs (lncRNAs) are important regulatory elements in biological processes. LncRNAs share similar sequence characteristics with messenger RNAs, but they play completely different roles, thus providing novel insights for biological studies. The development of next-generation sequencing has helped in the discovery of lncRNA transcripts. However, the experimental verification of numerous transcriptomes is time-consuming and costly. To alleviate these issues, a computational approach is needed to distinguish lncRNAs from the transcriptomes.
Results: We present a deep learning-based approach, lncRNAnet, to identify lncRNAs that incorporates recurrent neural networks for RNA sequence modeling and convolutional neural networks for detecting stop codons to obtain an open reading frame indicator. lncRNAnet performed clearly better than the other tools for sequences of short lengths, on which most lncRNAs are distributed. In addition, lncRNAnet successfully learned features and showed 7.83%, 5.76%, 5.30% and 3.78% improvements over the alternatives on a human test set in terms of specificity, accuracy, F1-score and area under the curve, respectively.
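As a rough illustration of what an "open reading frame indicator" could look like, the sketch below marks the longest ATG-to-stop ORF in a sequence with a 0/1 flag per nucleotide. It is a rule-based stand-in written for this record, not the authors' convolutional network; the function names `stop_codon_positions` and `orf_indicator`, the restriction to the three forward frames, and the use of the standard stop codons (TAA, TAG, TGA) are all assumptions.

```python
# Illustrative sketch only: a rule-based stand-in for the stop-codon
# detection that the abstract attributes to a convolutional network.
# All names here are hypothetical and not taken from the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def stop_codon_positions(seq, frame):
    """Return start indices of in-frame stop codons for one frame (0, 1 or 2)."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    return [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS]


def orf_indicator(seq):
    """Flag the longest ATG-to-stop ORF across the three forward frames.

    Returns one 0/1 flag per nucleotide; all zeros if no complete ORF exists.
    """
    seq = seq.upper().replace("U", "T")
    best_start, best_end = 0, 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start > best_end - best_start:
                    best_start, best_end = start, i + 3  # stop codon included
                start = None
    return [1 if best_start <= k < best_end else 0 for k in range(len(seq))]
```

A per-nucleotide flag of this kind could be fed to a sequence model alongside the one-hot encoded bases; whether lncRNAnet combines the two in exactly this way is not stated in the abstract.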
Pages: 3889-3897
Page count: 9