Accurate assembly of circular RNAs with TERRACE

Authors
Zahin, Tasfia [1]
Shi, Qian [1,2]
Zang, Xiaofei Carl
Shao, Mingfu [1,2]
Affiliations
[1] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
[2] Penn State Univ, Huck Inst Life Sci, University Pk, PA 16802 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
LANDSCAPE; ABUNDANT
DOI
10.1101/gr.279106.124
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Circular RNAs (circRNAs) are a class of RNA molecules that form a closed loop, with their 5' and 3' ends covalently bonded. CircRNAs are known to be more stable than linear RNAs, have distinct properties and functions, and are promising biomarkers. Existing methods for assembling circRNAs rely heavily on annotated transcriptomes and therefore show unsatisfactory accuracy when a high-quality transcriptome is not available. We present TERRACE, a new algorithm for full-length assembly of circRNAs from paired-end total RNA-seq data. TERRACE uses the splice graph as its underlying data structure to organize splicing and coverage information. We transform the problem of assembling circRNAs into one of finding paths that "bridge" the three fragments in the splice graph induced by back-spliced reads. We adopt a definition of optimal bridging paths and a dynamic programming algorithm to calculate such optimal paths. TERRACE features an efficient algorithm to detect back-spliced reads missed by RNA-seq aligners, which contributes to its much-improved sensitivity. It also incorporates a new machine-learning approach trained to assign a confidence score to each assembled circRNA, which is shown to be superior to scoring by abundance. On both simulated and biological data sets, TERRACE consistently outperforms existing methods by a large margin in sensitivity while achieving better or comparable precision. In particular, when annotations are not provided, TERRACE assembles 123%-413% more correct circRNAs than state-of-the-art methods. TERRACE represents a significant advance in assembling full-length circRNAs from RNA-seq data, and we expect it to be widely used in future research on circRNAs.
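The bridging step mentioned in the abstract can be illustrated with a small sketch. The abstract does not state how an "optimal" bridging path is defined, so the code below assumes one plausible criterion: among all paths between two splice-graph vertices, pick the one whose weakest edge has the highest coverage (a maximum-bottleneck path). The function name `best_bridging_path`, the edge-dictionary representation, and the integer vertex numbering are all hypothetical and are not taken from TERRACE itself.

```python
from typing import Dict, List, Optional, Tuple

Edge = Tuple[int, int]


def best_bridging_path(
    edges: Dict[Edge, float], source: int, target: int
) -> Optional[Tuple[float, List[int]]]:
    """Find a maximum-bottleneck path from source to target in a DAG.

    Assumptions (illustrative only, not TERRACE's actual definition):
    - vertices are integers already in topological order, so u < v
      holds for every edge (u, v), as when splice-graph vertices are
      sorted by genomic coordinate;
    - edge weights are read coverages;
    - the "optimal" path maximizes the minimum edge weight along it.
    """
    # best[v] = largest bottleneck of any path source -> v found so far
    best: Dict[int, float] = {source: float("inf")}
    prev: Dict[int, int] = {}

    # Sorting edges by tail vertex guarantees that every edge into u
    # has been relaxed before any edge leaving u is considered.
    for (u, v) in sorted(edges):
        if u not in best or v > target:
            continue
        score = min(best[u], edges[(u, v)])
        if score > best.get(v, float("-inf")):
            best[v] = score
            prev[v] = u

    if target not in best:
        return None  # the two fragments cannot be bridged

    # Walk the predecessor links back from target to source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return best[target], path


# Toy splice graph: two candidate routes from vertex 0 to vertex 3.
toy_edges = {(0, 1): 5, (1, 3): 2, (0, 2): 4, (2, 3): 3}
result = best_bridging_path(toy_edges, 0, 3)
```

In the toy graph, the route through vertex 2 is preferred because its weakest edge (coverage 3) is stronger than the weakest edge of the route through vertex 1 (coverage 2). A different optimality criterion would change only the relaxation step inside the loop.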
Pages: 1365-1370
Page count: 6