Sequence assembly demystified

Cited by: 283
Authors
Nagarajan, Niranjan [1]
Pop, Mihai [2]
Affiliations
[1] Genome Inst Singapore, Singapore 138672, Singapore
[2] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
Funding
National Science Foundation (USA);
Keywords
DE-BRUIJN GRAPHS; RNA-SEQ DATA; QUASI-SPECIES RECONSTRUCTION; GENOME ASSEMBLIES; STRUCTURAL VARIATION; SINGLE-CELL; SHORT READS; BACTERIAL GENOMES; DRAFT ASSEMBLIES; RESTRICTION MAPS;
DOI
10.1038/nrg3367
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Advances in sequencing technologies and increased access to sequencing services have led to renewed interest in sequence and genome assembly. Concurrently, new applications for sequencing have emerged, including gene expression analysis, discovery of genomic variants and metagenomics, and each of these has different needs and challenges in terms of assembly. We survey the theoretical foundations that underlie modern assembly and highlight the options and practical trade-offs that need to be considered, focusing on how individual features address the needs of specific applications. We also review key software and the interplay between experimental design and efficacy of assembly.
Pages: 157 - 167
Page count: 11
Related papers
96 items in total
  • [1] HapCompass: A Fast Cycle Basis Algorithm for Accurate Haplotype Assembly of Sequence Data
    Aguiar, Derek
    Istrail, Sorin
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2012, 19 (06) : 577 - 590
  • [2] Limitations of next-generation genome sequence assembly
    Alkan, Can
    Sajjadian, Saba
    Eichler, Evan E.
    [J]. NATURE METHODS, 2011, 8 (01) : 61 - 65
  • [3] [Anonymous], GENOME SCI TECHNOL, DOI 10.1089/GST.1995.1.9
  • [4] A new approach to sequence comparison: normalized sequence alignment
    Arslan, AN
    Egecioglu, Ö
    Pevzner, PA
    [J]. BIOINFORMATICS, 2001, 17 (04) : 327 - 337
  • [5] Inferring viral quasispecies spectra from 454 pyrosequencing reads
    Astrovskaya, Irina
    Tork, Bassam
    Mangul, Serghei
    Westbrooks, Kelly
    Mandoiu, Ion
    Balfe, Peter
    Zelikovsky, Alex
    [J]. BMC BIOINFORMATICS, 2011, 12
  • [6] SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing
    Bankevich, Anton
    Nurk, Sergey
    Antipov, Dmitry
    Gurevich, Alexey A.
    Dvorkin, Mikhail
    Kulikov, Alexander S.
    Lesin, Valery M.
    Nikolenko, Sergey I.
    Pham, Son
    Prjibelski, Andrey D.
    Pyshkin, Alexey V.
    Sirotkin, Alexander V.
    Vyahhi, Nikolay
    Tesler, Glenn
    Alekseyev, Max A.
    Pevzner, Pavel A.
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2012, 19 (05) : 455 - 477
  • [7] HapCUT: an efficient and accurate algorithm for the haplotype assembly problem
    Bansal, Vikas
    Bafna, Vineet
    [J]. BIOINFORMATICS, 2008, 24 (16) : I153 - I159
  • [8] Plantagora: Modeling Whole Genome Sequencing and Assembly of Plant Genomes
    Barthelson, Roger
    McFarlin, Adam J.
    Rounsley, Steven D.
    Young, Sarah
    [J]. PLOS ONE, 2011, 6 (12):
  • [9] Designing deep sequencing experiments: detecting structural variation and estimating transcript abundance
    Bashir, Ali
    Bansal, Vikas
    Bafna, Vineet
    [J]. BMC GENOMICS, 2010, 11
  • [10] De novo transcriptome assembly with ABySS
    Birol, Inanc
    Jackman, Shaun D.
    Nielsen, Cydney B.
    Qian, Jenny Q.
    Varhol, Richard
    Stazyk, Greg
    Morin, Ryan D.
    Zhao, Yongjun
    Hirst, Martin
    Schein, Jacqueline E.
    Horsman, Doug E.
    Connors, Joseph M.
    Gascoyne, Randy D.
    Marra, Marco A.
    Jones, Steven J. M.
    [J]. BIOINFORMATICS, 2009, 25 (21) : 2872 - 2877