From next-generation resequencing reads to a high-quality variant data set

Cited by: 61

Authors:

Pfeifer, S. P. [1, 2, 3]

Affiliations:

[1] Ecole Polytech Fed Lausanne, Sch Life Sci, Lausanne, Switzerland

[2] Swiss Inst Bioinformat, Lausanne, Switzerland

[3] Arizona State Univ, Sch Life Sci, Tempe, AZ 85287 USA

Source:

HEREDITY | 2017 / Vol. 118 / Issue 02

Keywords:

ACCURATE ERROR-CORRECTION; SEQUENCING DATA; CALLING PIPELINES; GENOMIC SEQUENCE; ALIGNMENT; DISCOVERY; ADAPTER; TOOL; ALGORITHMS; FRAMEWORK;

DOI:

10.1038/hdy.2016.102

Chinese Library Classification number:

Q14 [Ecology (Bioecology)];

Discipline classification code:

071012 ; 0713 ;

Abstract:

Sequencing has revolutionized biology by permitting the analysis of genomic variation at an unprecedented resolution. High-throughput sequencing is fast and inexpensive, making it accessible for a wide range of research topics. However, the produced data contain subtle but complex types of errors, biases and uncertainties that impose several statistical and computational challenges to the reliable detection of variants. To tap the full potential of high-throughput sequencing, a thorough understanding of the data produced as well as the available methodologies is required. Here, I review several commonly used methods for generating and processing next-generation resequencing data, discuss the influence of errors and biases together with their resulting implications for downstream analyses and provide general guidelines and recommendations for producing high-quality single-nucleotide polymorphism data sets from raw reads by highlighting several sophisticated reference-based methods representing the current state of the art.

Pages: 111-124

Page count: 14
