Circall: fast and accurate methodology for discovery of circular RNAs from paired-end RNA-sequencing data

Cited by: 9
Authors
Nguyen, Dat Thanh [1]
Trac, Quang Thinh [1]
Nguyen, Thi-Hau [2]
Nguyen, Ha-Nam [3]
Ohad, Nir [4]
Pawitan, Yudi [1]
Vu, Trung Nghia [1]
Affiliations
[1] Karolinska Inst, Dept Med Epidemiol & Biostat, Stockholm, Sweden
[2] Vietnam Natl Univ Hanoi, Univ Engn & Technol, Hanoi, Vietnam
[3] Vietnam Natl Univ Hanoi, Informat Technol Inst, Hanoi, Vietnam
[4] Tel Aviv Univ, Sch Plant Sci & Food Secur, Tel Aviv, Israel
Funding
Swedish Research Council;
Keywords
READ ALIGNMENT; REVEALS; BIOGENESIS;
DOI
10.1186/s12859-021-04418-8
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: Circular RNA (circRNA) is an emerging class of RNA molecules that has attracted attention for its potential as a marker for diagnosis and prognosis, or as a therapeutic target, in cancer, cardiovascular, and autoimmune diseases. Current methods for detecting circRNAs from RNA sequencing (RNA-seq) data focus mostly on improving the mapping quality of reads that support the back-splicing junction (BSJ) of a circRNA, in order to eliminate false positives (FPs). We show that mapping information alone often cannot predict whether a BSJ-supporting read is derived from a true circRNA, which increases the rate of FP circRNAs.
Results: We have developed Circall, a novel method for detecting circRNAs from RNA-seq data. Circall controls FPs using a robust multidimensional local false discovery rate method based on the length and expression of circRNAs. It is computationally highly efficient because it uses a quasi-mapping algorithm for fast and accurate RNA read alignment. We applied Circall to two simulated datasets and three experimental datasets of human cell lines. The results show that Circall achieves high sensitivity and precision on the simulated data. On the experimental datasets, it performs well against current leading methods. Circall is also substantially faster than the other methods, particularly for large datasets.
Conclusions: With its better circRNA detection performance and shorter computation time, Circall facilitates the analysis of circRNAs in large numbers of samples. Circall is implemented in C++ and R, and is available at https://www.meb.ki.se/sites/biostatwiki/circall and https://github.com/datngu/Circall.
Pages: 18