Benchmarking long-read RNA-sequencing analysis tools using in silico mixtures

Cited by: 22
Authors
Dong, Xueyi [1,2]
Du, Mei R. M. [1]
Gouil, Quentin [1,2]
Tian, Luyi [1,2,4]
Jabbari, Jafar S. [1,2]
Bowden, Rory [1,2]
Baldoni, Pedro L. [1,2]
Chen, Yunshun [1,2]
Smyth, Gordon K. [1,3]
Amarasinghe, Shanika L. [1,2,5]
Law, Charity W. [1,2]
Ritchie, Matthew E. [1,2]
Affiliations
[1] Walter & Eliza Hall Inst Med Res, Parkville, Vic, Australia
[2] Univ Melbourne, Dept Med Biol, Parkville, Vic, Australia
[3] Univ Melbourne, Sch Math & Stat, Parkville, Vic, Australia
[4] Guangzhou Natl Lab, Guangzhou, Peoples R China
[5] Monash Univ, Australian Regenerat Med Inst, Clayton, Vic, Australia
Funding
UK Medical Research Council
Keywords
QUALITY-CONTROL; R PACKAGE; QUANTIFICATION
DOI
10.1038/s41592-023-02026-3
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The lack of benchmark data sets with inbuilt ground truth makes it challenging to compare the performance of existing long-read isoform detection and differential expression analysis workflows. Here, we present a benchmark experiment using two human lung adenocarcinoma cell lines that were each profiled in triplicate together with synthetic, spliced, spike-in RNAs (sequins). Samples were deeply sequenced on both Illumina short-read and Oxford Nanopore Technologies long-read platforms. Alongside the ground truth available via the sequins, we created in silico mixture samples to allow performance assessment in the absence of true positives or true negatives. Our results show that StringTie2 and bambu outperformed the other tools among the six isoform detection tools tested; DESeq2, edgeR and limma-voom were best among the five differential transcript expression tools tested; and there was no clear front-runner for differential transcript usage analysis among the five tools compared, which suggests that further methods development is needed for this application.
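As a rough illustration of the in silico mixture idea mentioned in the abstract, the sketch below combines reads from two pure samples at a chosen proportion by random subsampling without replacement. It is not the authors' pipeline: the function name `in_silico_mixture`, the read identifiers, the 75:25 proportion and the total read count are all hypothetical, chosen only for the example.

```python
import random


def in_silico_mixture(reads_a, reads_b, prop_a, total, seed=0):
    """Draw `total` reads, a fraction `prop_a` from sample A and the rest from sample B.

    Hypothetical helper for illustration; reads are sampled without replacement.
    """
    if not 0.0 <= prop_a <= 1.0:
        raise ValueError("prop_a must be between 0 and 1")
    n_a = round(total * prop_a)   # number of reads taken from sample A
    n_b = total - n_a             # remainder comes from sample B
    if n_a > len(reads_a) or n_b > len(reads_b):
        raise ValueError("not enough reads to sample without replacement")
    rng = random.Random(seed)     # fixed seed so the mixture is reproducible
    mixed = rng.sample(reads_a, n_a) + rng.sample(reads_b, n_b)
    rng.shuffle(mixed)            # interleave reads from the two sources
    return mixed


# Toy read identifiers standing in for two pure cell-line samples (assumed data).
reads_a = [f"A_read{i}" for i in range(1000)]
reads_b = [f"B_read{i}" for i in range(1000)]

# A 75:25 mixture of 400 reads; the proportion is an assumption for the example.
mix = in_silico_mixture(reads_a, reads_b, prop_a=0.75, total=400)
```

Because the mixing proportion is known by construction, a quantity measured in the mixture can be compared against what the two pure samples would predict, which is one way such samples can support performance assessment when no labelled true positives or negatives exist.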
Pages: 1810-1821
Number of pages: 18