Alignment uncertainty and genomic analysis

Cited by: 273
Authors
Wong, Karen M. [2 ]
Suchard, Marc A. [3 ]
Huelsenbeck, John P. [1 ]
Affiliations
[1] Univ Calif Berkeley, Dept Integrat Biol, Berkeley, CA 94720 USA
[2] Univ Calif San Diego, Sect Ecol Behav & Evolut, La Jolla, CA 92093 USA
[3] Univ Calif Los Angeles, Dept Biomath, Los Angeles, CA 90095 USA
DOI
10.1126/science.1151532
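Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract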
The statistical methods applied to the analysis of genomic data do not account for uncertainty in the sequence alignment. Indeed, the alignment is treated as an observation, and all of the subsequent inferences depend on the alignment being correct. This may not have been too problematic for many phylogenetic studies, in which the gene is carefully chosen for, among other things, ease of alignment. However, in a comparative genomics study, the same statistical methods are applied repeatedly on thousands of genes, many of which will be difficult to align. Using genomic data from seven yeast species, we show that uncertainty in the alignment can lead to several problems, including different alignment methods resulting in different conclusions.
Pages: 473-476
Page count: 4
Related papers
23 entries in total
[1]   Inferring nonneutral evolution from human-chimp-mouse orthologous gene trios [J].
Clark, AG ;
Glanowski, S ;
Nielsen, R ;
Thomas, PD ;
Kejariwal, A ;
Todd, MA ;
Tanenbaum, DM ;
Civello, D ;
Lu, F ;
Murphy, B ;
Ferriera, S ;
Wang, G ;
Zheng, XG ;
White, TJ ;
Sninsky, JJ ;
Adams, MD ;
Cargill, M .
SCIENCE, 2003, 302 (5652) :1960-1963
[2]   Finding functional features in Saccharomyces genomes by phylogenetic footprinting [J].
Cliften, P ;
Sudarsanam, P ;
Desikan, A ;
Fulton, L ;
Fulton, B ;
Majors, J ;
Waterston, R ;
Cohen, BA ;
Johnston, M .
SCIENCE, 2003, 301 (5629) :71-76
[3]   Surveying Saccharomyces genomes to identify functional elements by comparative DNA sequence analysis [J].
Cliften, PF ;
Hillier, LW ;
Fulton, L ;
Graves, T ;
Miner, T ;
Gish, WR ;
Waterston, RH ;
Johnston, M .
GENOME RESEARCH, 2001, 11 (07) :1175-1186
[4]   ProbCons: Probabilistic consistency-based multiple sequence alignment [J].
Do, CB ;
Mahabhashyam, MSP ;
Brudno, M ;
Batzoglou, S .
GENOME RESEARCH, 2005, 15 (02) :330-340
[5]   MUSCLE: multiple sequence alignment with high accuracy and high throughput [J].
Edgar, RC .
NUCLEIC ACIDS RESEARCH, 2004, 32 (05) :1792-1797
[6]   EVALUATING THE PHYLOGENETIC UTILITY OF GENES - A SEARCH FOR GENES INFORMATIVE ABOUT DEEP DIVERGENCES AMONG VERTEBRATES [J].
GRAYBEAL, A .
SYSTEMATIC BIOLOGY, 1994, 43 (02) :174-193
[7]   Recursions for statistical multiple alignment [J].
Hein, J ;
Jensen, JL ;
Pedersen, CNS .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2003, 100 (25) :14960-14965
[8]   Evolutionary HMMs: a Bayesian approach to multiple alignment [J].
Holmes, I ;
Bruno, WJ .
BIOINFORMATICS, 2001, 17 (09) :803-820
[9]   MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform [J].
Katoh, K ;
Misawa, K ;
Kuma, K ;
Miyata, T .
NUCLEIC ACIDS RESEARCH, 2002, 30 (14) :3059-3066
[10]   Sequencing and comparison of yeast species to identify genes and regulatory elements [J].
Kellis, M ;
Patterson, N ;
Endrizzi, M ;
Birren, B ;
Lander, ES .
NATURE, 2003, 423 (6937) :241-254