Accounting for bias from sequencing error in population genetic estimates

Cited by: 71
Authors
Johnson, Philip L. F. [1]
Slatkin, Montgomery [2]
Affiliations
[1] Univ Calif Berkeley, Biophys Grad Grp, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Integrat Biol, Berkeley, CA 94720 USA
Keywords
sequencing error; population genetics; bias; quality score
DOI
10.1093/molbev/msm239
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Sequencing error presents a significant challenge to population genetic analyses using low-coverage sequence in general and single-pass reads in particular. Bias in parameter estimates becomes severe when the level of polymorphism (signal) is low relative to the amount of error (noise). Choosing an arbitrary quality score cutoff yields biased estimates, particularly with newer, non-Sanger sequencing technologies that have different quality score distributions. We propose a rule of thumb to judge when a given threshold will lead to significant bias and suggest alternative approaches that reduce bias.
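The signal-versus-noise point in the abstract can be illustrated with the standard Phred definition, Q = -10 log10(p), which converts a quality score cutoff into a worst-case per-base error probability. The sketch below is only an illustration of that comparison and is not the rule of thumb proposed in the paper, which the abstract does not state. The function names and the per-site polymorphism value used in the example are hypothetical.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to an error probability (Q = -10 * log10(p))."""
    return 10 ** (-q / 10)


def noise_to_signal(q_cutoff: float, polymorphism_per_site: float) -> float:
    """Ratio of the worst-case per-base error probability at the cutoff
    to an assumed per-site level of polymorphism.

    Bases that pass the cutoff have error probability at most
    phred_to_error_prob(q_cutoff), so this ratio is an upper bound.
    """
    if polymorphism_per_site <= 0:
        raise ValueError("polymorphism_per_site must be positive")
    return phred_to_error_prob(q_cutoff) / polymorphism_per_site


if __name__ == "__main__":
    # Hypothetical polymorphism level, chosen only for illustration.
    assumed_polymorphism = 0.001
    for cutoff in (20, 30, 40):
        ratio = noise_to_signal(cutoff, assumed_polymorphism)
        print(f"Q{cutoff}: error <= {phred_to_error_prob(cutoff):.0e}, "
              f"noise/signal <= {ratio:.2g}")
```

A ratio near or above 1 means that errors passing the cutoff could be as frequent as true polymorphisms, which is the regime in which the abstract says bias becomes severe.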
Pages: 199-206
Number of pages: 8
Related papers
29 in total
  • [1] Genome dynamics in a natural archaeal population
    Allen, Eric E.
    Tyson, Gene W.
    Whitaker, Rachel J.
    Detter, John C.
    Richardson, Paul M.
    Banfield, Jillian F.
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2007, 104 (06) : 1883 - 1888
  • [2] Analysis of the quality and utility of random shotgun sequencing at low redundancies
    Bouck, J
    Miller, W
    Gorrell, JH
    Muzny, D
    Gibbs, RA
    [J]. GENOME RESEARCH, 1998, 8 (10) : 1074 - 1084
  • [3] Phantom mutation hotspots in human mitochondrial DNA
    Brandstätter, A
    Sänger, T
    Lutz-Bonengel, S
    Parson, W
    Béraud-Colomb, E
    Wen, B
    Kong, QP
    Bravi, CM
    Bandelt, HJ
    [J]. ELECTROPHORESIS, 2005, 26 (18) : 3414 - 3429
  • [4] Patterns of damage in genomic DNA sequences from a Neandertal
    Briggs, Adrian W.
    Stenzel, Udo
    Johnson, Philip L. F.
    Green, Richard E.
    Kelso, Janet
    Prüfer, Kay
    Meyer, Matthias
    Krause, Johannes
    Ronan, Michael T.
    Lachmann, Michael
    Pääbo, Svante
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2007, 104 (37) : 14616 - 14621
  • [5] Nucleotide diversity and linkage disequilibrium in loblolly pine
    Brown, GR
    Gill, GP
    Kuntz, RJ
    Langley, CH
    Neale, DB
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2004, 101 (42) : 15255 - 15260
  • [6] CLARK AG, 1992, MOL BIOL EVOL, V9, P744
  • [7] Correlations, descent measures - drift with migration and mutation
    Cockerham, CC
    Weir, BS
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1987, 84 (23) : 8512 - 8514
  • [8] The patterns of natural variation in human genes
    Crawford, DC
    Akey, DT
    Nickerson, DA
    [J]. ANNUAL REVIEW OF GENOMICS AND HUMAN GENETICS, 2005, 6 : 287 - 312
  • [9] Base-calling of automated sequencer traces using phred. II. Error probabilities
    Ewing, B
    Green, P
    [J]. GENOME RESEARCH, 1998, 8 (03) : 186 - 194
  • [10] Segmental phylogenetic relationships of inbred mouse strains revealed by fine-scale analysis of sequence variation across 4.6 Mb of mouse genome
    Frazer, KA
    Wade, CM
    Hinds, DA
    Patil, N
    Cox, DR
    Daly, MJ
    [J]. GENOME RESEARCH, 2004, 14 (08) : 1493 - 1500