Optimal sequencing depth design for whole genome re-sequencing in pigs

Cited by: 39
Authors
Jiang, Yifan [1]
Jiang, Yao [1]
Wang, Sheng [1]
Zhang, Qin [2]
Ding, Xiangdong [1]
Affiliations
[1] China Agr Univ, Coll Anim Sci & Technol, Minist Agr, Natl Engn Lab Anim Breeding,Lab Anim Genet Breedi, Beijing 100193, Peoples R China
[2] Shandong Agr Univ, Coll Anim Sci & Technol, Shandong Prov Key Lab Anim Biotechnol & Dis Contr, Tai An 271001, Shandong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Genome coverage; Sequencing depth; Pig; Whole-genome sequencing; VARIATION DISCOVERY; GENETIC-VARIATION; QUALITY-CONTROL; COVERAGE; SELECTION; ADAPTATION; IMPUTATION; SIGNATURES; FRAMEWORK; ORIGIN;
DOI
10.1186/s12859-019-3164-z
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010 ; 081704 ;
Abstract
Background: As whole-genome sequencing becomes a routine technique, it is important to identify a cost-effective sequencing depth for such studies. However, the relationship between sequencing depth and biological results, in terms of whole-genome coverage, variant discovery power and variant quality, remains unclear, especially in pigs. We sequenced the genomes of three Yorkshire boars at approximately 20X depth on the Illumina HiSeq X Ten platform and downloaded whole-genome sequencing data for three Duroc and three Landrace pigs, also at approximately 20X depth per individual. We then downsampled the deep genome data by extracting eleven proportions (0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8 and 0.9) of paired reads from the original BAM files. Together with the full data, this mimicked sequence data for the same individuals at twelve sequencing depths of 1.09X, 2.18X, 3.26X, 4.35X, 6.53X, 8.70X, 10.88X, 13.05X, 15.22X, 17.40X, 19.57X and 21.75X, which we used to evaluate genome coverage, variant discovery rate and genotyping accuracy as functions of sequencing depth. In addition, SNP chip data for the Yorkshire pigs were used to validate the comparison of single-sample and multisample calling algorithms.

Results: Our results indicated that 10X is an ideal practical depth for reaching plateau coverage and discovering accurate variants, with greater than 99% of the genome covered. The number of false-positive variants increased dramatically at depths below 4X, at which 95% of the whole genome was covered. In addition, the comparison of multisample and single-sample calling showed that multisample calling was more sensitive than single-sample calling, especially at lower depths. The number of variants discovered under multisample calling was 13-fold and 2-fold higher than that under single-sample calling at 1X and 22X, respectively. A large difference was observed when the depth was less than 4.38X. However, more false-positive variants were detected under multisample calling.

Conclusions: Our research will inform important study design decisions regarding whole-genome sequencing depth. Our results will be helpful for choosing the appropriate depth to achieve the same power for studies performed under limited budgets.
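The downsampling arithmetic in the abstract can be checked with a short calculation. The sketch below is illustrative only: it assumes that the 21.75X figure is the mean depth of the full data (proportion 1.0) and that each downsampled depth is simply that value multiplied by the retained proportion of paired reads. The function names are hypothetical and are not taken from the paper.

```python
# Mean depth of the full data as given in the abstract.
# Assumption: 21.75X corresponds to keeping all reads (proportion 1.0).
FULL_DEPTH = 21.75

# Proportions of paired reads retained, as listed in the abstract.
PROPORTIONS = [0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

# Downsampled depths reported in the abstract, in the same order.
REPORTED_DEPTHS = [1.09, 2.18, 3.26, 4.35, 6.53, 8.70,
                   10.88, 13.05, 15.22, 17.40, 19.57]


def expected_depths(full_depth, proportions):
    """Expected mean depth after keeping each proportion of the reads."""
    return [full_depth * p for p in proportions]


def max_abs_deviation(computed, reported):
    """Largest absolute gap between computed and reported depths."""
    return max(abs(c - r) for c, r in zip(computed, reported))


if __name__ == "__main__":
    computed = expected_depths(FULL_DEPTH, PROPORTIONS)
    for p, c, r in zip(PROPORTIONS, computed, REPORTED_DEPTHS):
        print(f"proportion {p:<4} expected {c:7.4f}X  reported {r:5.2f}X")
    print("max deviation:", max_abs_deviation(computed, REPORTED_DEPTHS))
```

Under these assumptions, every reported depth agrees with the product of the full depth and the proportion to within 0.01X, which is consistent with rounding to two decimal places.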
Pages: 12