Impact of human gene annotations on RNA-seq differential expression analysis

Cited by: 6
Authors
Hamaguchi, Yu [1]
Zeng, Chao [1,2]
Hamada, Michiaki [1,2,3,4]
Affiliations
[1] Waseda Univ, Fac Sci & Engn, Shinjuku Ku, 55N-06-10,3-4-1 Okubo, Tokyo 1698555, Japan
[2] Waseda Univ, AIST, Computat Bio Big Data Open Innovat Lab CBBD OIL, Shinjuku Ku, 3-4-1 Okubo, Tokyo 1698555, Japan
[3] Waseda Univ, Inst Med Oriented Struct Biol, Shinjuku Ku, 2-2 Wakamatsu Cho, Tokyo 1628480, Japan
[4] Nippon Med Sch, Grad Sch Med, Bunkyo Ku, 1-1-5 Sendagi, Tokyo 1138602, Japan
Keywords
RNA-seq; Differential expression analysis; Benchmarking; Gene annotation; QUANTIFICATION; TRANSCRIPTOME; DISCOVERY; ALIGNMENT; HISAT
DOI
10.1186/s12864-021-08038-7
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Background: Differential expression (DE) analysis of RNA-seq data typically depends on gene annotations. Several sets of gene annotations are available for the human genome, and they are continually updated, a process complicated by the development and application of high-throughput sequencing technologies. However, the impact of the complexity of gene annotations on DE analysis remains unclear.
Results: Using "mappability", a metric of the complexity of gene annotation, we compared three distinct human gene annotations (GENCODE, RefSeq, and NONCODE) and evaluated how mappability affected DE analysis. We found that mappability differed significantly among the three annotations. We also found that increasing mappability improved the performance of DE analysis, and that the impact of mappability was mainly evident in the quantification step and propagated systematically to the downstream steps of DE analysis.
Conclusions: We assessed how the complexity of gene annotations affects DE analysis using mappability. Our findings indicate that the growth and complexity of gene annotations negatively impact the performance of DE analysis, suggesting that excluding unnecessary gene models from gene annotations improves the performance of DE analysis.
Pages: 12