circDeep: deep learning approach for circular RNA classification from other long non-coding RNA

Citations: 51
Authors
Chaabane, Mohamed [1]
Williams, Robert M. [1]
Stephens, Austin T. [1]
Park, Juw Won [1, 2]
Affiliations
[1] Univ Louisville, Dept Comp Engn & Comp Sci, Louisville, KY 40208 USA
[2] Univ Louisville, KBRIN Bioinformat Core, Louisville, KY 40208 USA
DOI
10.1093/bioinformatics/btz537
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Over the past two decades, a circular form of RNA (circular RNA), produced through alternative splicing, has become the focus of scientific studies due to its major role as a microRNA (miRNA) activity modulator and its association with various diseases including cancer. Therefore, the detection of circular RNAs is vital to understanding their biogenesis and purpose. Prediction of circular RNA can be achieved in three steps: distinguishing non-coding RNAs from protein-coding gene transcripts, separating short and long non-coding RNAs, and predicting circular RNAs from other long non-coding RNAs (lncRNAs). However, the available tools are less than 80 percent accurate for distinguishing circular RNAs from other lncRNAs due to the difficulty of classification. Therefore, the availability of a more accurate and fast machine learning method for the identification of circular RNAs, which considers the specific features of circular RNA, is essential to the development of systematic annotation.
Results: Here we present an end-to-end deep learning framework, circDeep, to classify circular RNA from other lncRNA. circDeep fuses an RCM descriptor, an ACNN-BLSTM sequence descriptor and a conservation descriptor into high-level abstraction descriptors, where the shared representations across different modalities are integrated. The experiments show that circDeep is not only faster than existing tools but also performs at an unprecedented level of accuracy by achieving a 12 percent increase in accuracy over the other tools.
Availability and implementation: https://github.com/UofLBioinformatics/circDeep
Supplementary information: Supplementary data are available at Bioinformatics online.
Pages: 73-80
Page count: 8