Sensitivity, specificity, and reproducibility of RNA-Seq differential expression calls

Cited by: 25
Authors
Labaj, Pawel P. [1, 2]
Kreil, David P. [2]
Affiliations
[1] Austrian Acad Sci, Vienna, Austria
[2] Boku Univ, Bioinformat Res Grp, Vienna, Austria
Keywords
RNA-seq; Sensitivity; Specificity; Reproducibility; Differential expression calling; GENE; PACKAGE;
DOI
10.1186/s13062-016-0169-7
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Background: The MAQC/SEQC consortium has recently compiled a key benchmark that can serve for testing the latest developments in analysis tools for microarray and RNA-seq expression profiling. Such objective benchmarks are required for basic and applied research, and can be critical for clinical and regulatory outcomes. Going beyond the first comparisons presented in the original SEQC study, we here present extended benchmarks including effect strengths typical of common experiments.
Results: With artefacts removed by factor analysis and additional filters, for genome-scale surveys, the reproducibility of differential expression calls typically exceeds 80% for all tool combinations examined. This directly reflects the robustness of results and reproducibility across different studies. Similar improvements are observed for the top-ranked candidates with the strongest relative expression change, although here some tools clearly perform better than others, with typical reproducibility ranging from 60 to 93%.
Conclusions: In our benchmark of alternative tools for RNA-seq data analysis we demonstrated the benefits that can be gained by analysing results in the context of other experiments employing a reference standard sample. This allowed the computational identification and removal of hidden confounders, for instance, by factor analysis. In itself, this already substantially improved the empirical False Discovery Rate (eFDR) without changing the overall landscape of sensitivity. Further filtering of false positives, however, is required to obtain acceptable eFDR levels. Appropriate filters noticeably improved agreement of differentially expressed genes both across sites and between alternative differential expression analysis pipelines.
Pages: 12
Related papers
50 in total
  • [1] Sensitivity, specificity, and reproducibility of RNA-Seq differential expression calls
    Paweł P. Łabaj
    David P. Kreil
    Biology Direct, 11
  • [2] Differential expression in RNA-seq: A matter of depth
    Tarazona, Sonia
    Garcia-Alcalde, Fernando
    Dopazo, Joaquin
    Ferrer, Alberto
    Conesa, Ana
    GENOME RESEARCH, 2011, 21 (12) : 2213 - 2223
  • [3] Power analysis for RNA-Seq differential expression studies
    Yu, Lianbo
    Fernandez, Soledad
    Brock, Guy
    BMC BIOINFORMATICS, 2017, 18
  • [4] Robustness of differential gene expression analysis of RNA-seq
    Stupnikov, A.
    McInerney, C. E.
    Savage, K. I.
    McIntosh, S. A.
    Emmert-Streib, F.
    Kennedy, R.
    Salto-Tellez, M.
    Prise, K. M.
    McArt, D. G.
    COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2021, 19 : 3470 - 3481
  • [5] Detecting differential expression from RNA-seq data with expression measurement uncertainty
    Zhang, Li
    Chen, Songcan
    Liu, Xuejun
    FRONTIERS OF COMPUTER SCIENCE, 2015, 9 (04) : 652 - 663
  • [6] A new shrinkage estimator for dispersion improves differential expression detection in RNA-seq data
    Wu, Hao
    Wang, Chi
    Wu, Zhijin
    BIOSTATISTICS, 2013, 14 (02) : 232 - 243
  • [7] Power analysis for RNA-Seq differential expression studies
    Lianbo Yu
    Soledad Fernandez
    Guy Brock
    BMC Bioinformatics, 18
  • [8] Identifying differential expression for RNA-seq data with no replication
    Gim, Jungsoo
    Park, Taesung
    2012 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOPS (BIBMW), 2012,
  • [9] From RNA-seq reads to differential expression results
    Oshlack, Alicia
    Robinson, Mark D.
    Young, Matthew D.
    GENOME BIOLOGY, 2010, 11 (12):
  • [10] Differential gene expression analysis using coexpression and RNA-Seq data
    Yang, Ei-Wen
    Girke, Thomas
    Jiang, Tao
    BIOINFORMATICS, 2013, 29 (17) : 2153 - 2161