Recommendations for Accurate Resolution of Gene and Isoform Allele-Specific Expression in RNA-Seq Data

Cited by: 13
Authors
Wood, David L. A. [1]
Nones, Katia [1]
Steptoe, Anita [1]
Christ, Angelika [1]
Harliwong, Ivon [1]
Newell, Felicity [1]
Bruxner, Timothy J. C. [1]
Miller, David [1]
Cloonan, Nicole [2]
Grimmond, Sean M. [1, 3]
Affiliations
[1] Univ Queensland, Queensland Ctr Med Genom, Brisbane, Qld, Australia
[2] QIMR Berghofer Med Res Inst, Herston, Qld 4006, Australia
[3] Univ Glasgow, Translat Res Ctr, Glasgow, Lanark, Scotland
Source
PLOS ONE | 2015 / Volume 10 / Issue 05
Funding
Australian Research Council;
Keywords
HUMAN GENOME; TRANSCRIPTOME; HUMANS; METHYLATION; IMBALANCE; SEQUENCE; DISEASE; READS; RISK;
DOI
10.1371/journal.pone.0126911
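Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes
07; 0710; 09;
Abstract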
Genetic variation modulates gene expression transcriptionally or post-transcriptionally, and can profoundly alter an individual's phenotype. Measuring allelic differential expression at heterozygous loci within an individual, a phenomenon called allele-specific expression (ASE), can assist in identifying such factors. Massively parallel DNA and RNA sequencing and advances in bioinformatic methodologies provide an outstanding opportunity to measure ASE genome-wide. In this study, matched DNA and RNA sequencing, genotyping arrays and computationally phased haplotypes were integrated to comprehensively and conservatively quantify ASE in a single human brain and liver tissue sample. We describe a methodological evaluation and assessment of common bioinformatic steps for ASE quantification, and recommend a robust approach to accurately measure SNP, gene and isoform ASE through the use of personalized haplotype genome alignment, strict alignment quality control and intragenic SNP aggregation. Our results indicate that accurate ASE quantification requires careful bioinformatic analyses and is adversely affected by sample specific alignment confounders and random sampling even at moderate sequence depths. We identified multiple known and several novel ASE genes in liver, including WDR72, DSP and UBD, as well as genes that contained ASE SNPs with imbalance direction discordant with haplotype phase, explainable by annotated transcript structure, suggesting isoform derived ASE. The methods evaluated in this study will be of use to researchers performing highly conservative quantification of ASE, and the genes and isoforms identified as ASE of interest to researchers studying those loci.
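The intragenic SNP aggregation mentioned in the abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that each phased heterozygous SNP in a gene is given as a tuple of reference reads, alternate reads, and a flag saying whether the reference allele lies on haplotype 1, and that imbalance is assessed with an exact two-sided binomial test against a 50:50 expectation. The abstract does not state which statistical test the authors used, so the test, the function names and the read counts below are illustrative assumptions only.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed count k. Assumption: this is a generic test chosen for
    illustration, not necessarily the one used in the paper. Suitable
    for moderate n (roughly n < 1000) because of float range limits.
    """
    if n == 0:
        return 1.0
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


def aggregate_gene_counts(snps):
    """Sum reads per haplotype across phased heterozygous SNPs in a gene.

    Each SNP is a hypothetical tuple (ref_reads, alt_reads, ref_on_hap1).
    """
    hap1 = hap2 = 0
    for ref_reads, alt_reads, ref_on_hap1 in snps:
        if ref_on_hap1:
            hap1 += ref_reads
            hap2 += alt_reads
        else:
            hap1 += alt_reads
            hap2 += ref_reads
    return hap1, hap2


# Made-up counts for three SNPs in one gene (illustration only).
snps = [(30, 10, True), (8, 22, False), (25, 15, True)]
hap1, hap2 = aggregate_gene_counts(snps)
p_value = binomial_two_sided_p(hap1, hap1 + hap2)
print(hap1, hap2, p_value)
```

Aggregating by haplotype rather than by reference versus alternate allele is what lets SNPs with opposite phase contribute to the same gene-level count. A SNP whose imbalance direction disagrees with the haplotype phase would dilute the aggregate, which is the kind of discordance the abstract attributes to isoform-derived ASE.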
Pages: 27