Recommendations for Accurate Resolution of Gene and Isoform Allele-Specific Expression in RNA-Seq Data

Cited by: 13
Authors
Wood, David L. A. [1]
Nones, Katia [1]
Steptoe, Anita [1]
Christ, Angelika [1]
Harliwong, Ivon [1]
Newell, Felicity [1]
Bruxner, Timothy J. C. [1]
Miller, David [1]
Cloonan, Nicole [2]
Grimmond, Sean M. [1, 3]
Affiliations
[1] Univ Queensland, Queensland Ctr Med Genom, Brisbane, Qld, Australia
[2] QIMR Berghofer Med Res Inst, Herston, Qld 4006, Australia
[3] Univ Glasgow, Translat Res Ctr, Glasgow, Lanark, Scotland
Source
PLOS ONE | 2015 / Vol. 10 / Issue 05
Funding
Australian Research Council
Keywords
HUMAN GENOME; TRANSCRIPTOME; HUMANS; METHYLATION; IMBALANCE; SEQUENCE; DISEASE; READS; RISK;
DOI
10.1371/journal.pone.0126911
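Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract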
Genetic variation modulates gene expression transcriptionally or post-transcriptionally, and can profoundly alter an individual's phenotype. Measuring allelic differential expression at heterozygous loci within an individual, a phenomenon called allele-specific expression (ASE), can assist in identifying such factors. Massively parallel DNA and RNA sequencing and advances in bioinformatic methodologies provide an outstanding opportunity to measure ASE genome-wide. In this study, matched DNA and RNA sequencing, genotyping arrays and computationally phased haplotypes were integrated to comprehensively and conservatively quantify ASE in a single human brain and liver tissue sample. We describe a methodological evaluation and assessment of common bioinformatic steps for ASE quantification, and recommend a robust approach to accurately measure SNP, gene and isoform ASE through the use of personalized haplotype genome alignment, strict alignment quality control and intragenic SNP aggregation. Our results indicate that accurate ASE quantification requires careful bioinformatic analyses and is adversely affected by sample-specific alignment confounders and random sampling even at moderate sequence depths. We identified multiple known and several novel ASE genes in liver, including WDR72, DSP and UBD, as well as genes that contained ASE SNPs with imbalance direction discordant with haplotype phase, explainable by annotated transcript structure, suggesting isoform-derived ASE. The methods evaluated in this study will be of use to researchers performing highly conservative quantification of ASE, and the genes and isoforms identified as showing ASE will be of interest to researchers studying those loci.
Pages: 27