Comparative assessment of methods for aligning multiple genome sequences

Cited by: 27
Authors
Chen, Xiaoyu [1]
Tompa, Martin [1]
Affiliations
[1] Univ Washington, Dept Comp Sci & Engn, Dept Genome Sci, Seattle, WA 98195 USA
Funding
US National Institutes of Health; Natural Sciences and Engineering Research Council of Canada;
Keywords
NONCODING SEQUENCES; ALIGNMENT; UNCERTAINTY; VERTEBRATE; CONSTRAINT; DISCOVERY; THOUSANDS;
DOI
10.1038/nbt.1637
Chinese Library Classification number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Multiple sequence alignment is a difficult computational problem. There have been compelling pleas for methods to assess whole-genome multiple sequence alignments and compare the alignments produced by different tools. We assess the four ENCODE alignments, each of which aligns 28 vertebrates on 554 Mbp of total input sequence. We measure the level of agreement among the alignments and compare their coverage and accuracy. We find a disturbing lack of agreement among the alignments not only in species distant from human, but even in mouse, a well-studied model organism. Overall, the assessment shows that Pecan produces the most accurate or nearly most accurate alignment in all species and genomic location categories, while still providing coverage comparable to or better than that of the other alignments in the placental mammals. Our assessment reveals that constructing accurate whole-genome multiple sequence alignments remains a significant challenge, particularly for noncoding regions and distantly related species.
Pages: 567 / U53
Number of pages: 8
Related papers
40 in total
  • [11] APPLICATIONS AND STATISTICS FOR MULTIPLE HIGH-SCORING SEGMENTS IN MOLECULAR SEQUENCES
    KARLIN, S
    ALTSCHUL, SF
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1993, 90 (12) : 5873 - 5877
  • [12] The human genome browser at UCSC
    Kent, WJ
    Sugnet, CW
    Furey, TS
    Roskin, KM
    Pringle, TH
    Zahler, AM
    Haussler, D
    [J]. GENOME RESEARCH, 2002, 12 (06) : 996 - 1006
  • [13] Adaptive evolution of conserved noncoding elements in mammals
    Kim, Su Yeon
    Pritchard, Jonathan K.
    [J]. PLOS GENETICS, 2007, 3 (09) : 1572 - 1586
  • [14] Multiple sequence alignment: In pursuit of homologous DNA positions
    Kumar, Sudhir
    Filipski, Alan
    [J]. GENOME RESEARCH, 2007, 17 (02) : 127 - 135
  • [15] Uncertainty in homology inferences: Assessing and improving genomic sequence alignment
    Lunter, Gerton
    Rocco, Andrea
    Mimouni, Naila
    Heger, Andreas
    Caldeira, Alexandre
    Hein, Jotun
    [J]. GENOME RESEARCH, 2008, 18 (02) : 298 - 309
  • [16] Identification and characterization of multi-species conserved sequences
    Margulies, EH
    Blanchette, M
    Haussler, D
    Green, ED
    [J]. GENOME RESEARCH, 2003, 13 (12) : 2507 - 2518
  • [17] Approaches to comparative sequence analysis: towards a functional view of vertebrate genomes
    Margulies, Elliott H.
    Birney, Ewan
    [J]. NATURE REVIEWS GENETICS, 2008, 9 (04) : 303 - 313
  • [18] Confidence in comparative genomics
    Margulies, Elliott H.
    [J]. GENOME RESEARCH, 2008, 18 (02) : 199 - 200
  • [19] Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome
    Margulies, Elliott H.
    Cooper, Gregory M.
    Asimenos, George
    Thomas, Daryl J.
    Dewey, Colin N.
    Siepel, Adam
    Birney, Ewan
    Keefe, Damian
    Schwartz, Ariel S.
    Hou, Minmei
    Taylor, James
    Nikolaev, Sergey
    Montoya-Burgos, Juan I.
    Loytynoja, Ari
    Whelan, Simon
    Pardi, Fabio
    Massingham, Tim
    Brown, James B.
    Bickel, Peter
    Holmes, Ian
    Mullikin, James C.
    Ureta-Vidal, Abel
    Paten, Benedict
    Stone, Eric A.
    Rosenbloom, Kate R.
    Kent, W. James
    Antonarakis, Stylianos E.
    Batzoglou, Serafim
    Goldman, Nick
    Hardison, Ross
    Haussler, David
    Miller, Webb
    Pachter, Lior
    Green, Eric D.
    Sidow, Arend
    [J]. GENOME RESEARCH, 2007, 17 (06) : 760 - 774
  • [20] Resolution of the early placental mammal radiation using Bayesian phylogenetics
    Murphy, WJ
    Eizirik, E
    O'Brien, SJ
    Madsen, O
    Scally, M
    Douady, CJ
    Teeling, E
    Ryder, OA
    Stanhope, MJ
    de Jong, WW
    Springer, MS
    [J]. SCIENCE, 2001, 294 (5550) : 2348 - 2351