On the effective depth of viral sequence data
Cited by: 28
Authors:
Illingworth, Christopher J. R. [1,2]
Roy, Sunando [3]
Beale, Mathew A. [4]
Tutill, Helena [3]
Williams, Rachel [3]
Breuer, Judith [3]
Affiliations:
[1] Univ Cambridge, Dept Genet, Cambridge, England
[2] Univ Cambridge, Ctr Math Sci, Dept Appl Maths & Theoret Phys, Cambridge, England
[3] UCL, Div Infect & Immun, London, England
[4] Wellcome Trust Sanger Inst, Cambridge, England
Keywords:
population genetics;
sequence data;
evolutionary modelling;
DIVERSITY;
EVOLUTION;
ERRORS;
VIRUS;
TRANSMISSION;
POPULATIONS;
ADAPTATION;
TISSUES;
DOI:
10.1093/ve/vex030
Chinese Library Classification:
Q93 [Microbiology];
Subject classification codes:
071005 ;
100705 ;
Abstract:
Genome sequence data are of great value in describing evolutionary processes in viral populations. However, in such studies, the extent to which the data accurately describe the viral population is a matter of importance. Multiple factors may influence the accuracy of a dataset, including the quantity and nature of the sample collected, and the subsequent steps in viral processing. To investigate this phenomenon, we sequenced replica datasets spanning a range of viruses, in which the point at which samples were split was different in each case, from a dataset in which independent samples were collected from a single patient to another in which all processing steps up to sequencing were applied to a single sample before splitting the sample and sequencing each replicate. We conclude that neither a high read depth nor a high template number in a sample guarantees the precision of a dataset. Measures of consistency calculated from within a single biological sample may also be insufficient; distortion of the composition of a population by the experimental procedure or genuine within-host diversity between samples may each affect the results. Where possible, data from replicate samples should be collected to validate the consistency of short-read sequence data.
Pages: 9