Circall: fast and accurate methodology for discovery of circular RNAs from paired-end RNA-sequencing data

Times cited: 9
Authors
Nguyen, Dat Thanh [1]
Trac, Quang Thinh [1]
Nguyen, Thi-Hau [2]
Nguyen, Ha-Nam [3]
Ohad, Nir [4]
Pawitan, Yudi [1]
Vu, Trung Nghia [1]
Affiliations
[1] Karolinska Inst, Dept Med Epidemiol & Biostat, Stockholm, Sweden
[2] Vietnam Natl Univ Hanoi, Univ Engn & Technol, Hanoi, Vietnam
[3] Vietnam Natl Univ Hanoi, Informat Technol Inst, Hanoi, Vietnam
[4] Tel Aviv Univ, Sch Plant Sci & Food Secur, Tel Aviv, Israel
Funding
Swedish Research Council;
Keywords
READ ALIGNMENT; REVEALS; BIOGENESIS;
DOI
10.1186/s12859-021-04418-8
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: Circular RNA (circRNA) is an emerging class of RNA molecules that has attracted researchers because of its potential to serve as a marker for diagnosis or prognosis, or as a therapeutic target, in cancer, cardiovascular, and autoimmune diseases. Current methods for detecting circRNA from RNA sequencing (RNA-seq) focus mostly on improving the mapping quality of reads that support the back-splicing junction (BSJ) of a circRNA in order to eliminate false positives (FPs). We show that mapping information alone often cannot predict whether a BSJ-supporting read is derived from a true circRNA, which increases the rate of FP circRNAs.
Results: We have developed Circall, a novel method for detecting circRNA from RNA-seq. Circall controls FPs using a robust multidimensional local false discovery rate method based on the length and expression of circRNAs. It is computationally highly efficient because it uses a quasi-mapping algorithm for fast and accurate alignment of RNA reads. We applied Circall to two simulated datasets and three experimental datasets of human cell lines. The results show that Circall achieves high sensitivity and precision on the simulated data. On the experimental datasets it performs well against current leading methods. Circall is also substantially faster than the other methods, particularly for large datasets.
Conclusions: With its better detection performance and shorter computational time, Circall facilitates the analysis of circRNAs in large numbers of samples. Circall is implemented in C++ and R and is available at https://www.meb.ki.se/sites/biostatwiki/circall and https://github.com/datngu/Circall.
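The abstract names a multidimensional local false discovery rate based on circRNA length and expression but gives no formula. As a generic illustration only, and not Circall's actual implementation, the sketch below estimates a two-dimensional local FDR on a grid as fdr(bin) = pi0 * f0(bin) / f(bin), where f is the observed density of candidates and f0 is the density under a null set. The function names, the bin edges, the null set, and the default pi0 = 1.0 are all assumptions made for this example. Raw bin counts are used for brevity; a practical estimator would normally smooth both densities.

```python
from bisect import bisect_right
from collections import Counter


def bin_index(value, edges):
    """Return the index of the bin containing `value`, given sorted inner edges."""
    return bisect_right(edges, value)


def local_fdr_2d(observed, null, length_edges, expr_edges, pi0=1.0):
    """Estimate fdr(bin) = pi0 * f0(bin) / f(bin) on a 2D (length, expression) grid.

    observed, null: lists of (length, expression) pairs (hypothetical inputs).
    Returns a dict mapping each occupied observed bin to its estimated local FDR.
    """
    def counts(points):
        return Counter(
            (bin_index(length, length_edges), bin_index(expr, expr_edges))
            for length, expr in points
        )

    obs_counts = counts(observed)
    null_counts = counts(null)
    n_obs, n_null = len(observed), len(null)

    fdr = {}
    for cell, c_obs in obs_counts.items():
        f = c_obs / n_obs                      # observed density in this bin
        f0 = null_counts.get(cell, 0) / n_null  # null density in this bin
        fdr[cell] = min(1.0, pi0 * f0 / f)      # cap at 1, since it is a probability
    return fdr


# Toy data: short, highly expressed candidates are rare under the null.
observed = [(300, 10.0)] * 8 + [(5000, 1.0)] * 2
null = [(5000, 1.0)] * 9 + [(300, 10.0)] * 1
result = local_fdr_2d(observed, null, length_edges=[1000], expr_edges=[5.0])
```

In this toy example the bin of short, highly expressed candidates receives a low estimated local FDR, while the bin of long, weakly expressed candidates is dominated by the null and is capped at 1. Candidates would then be kept or discarded by thresholding the value for their bin.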
Pages: 18