Phylotranscriptomics: Saturated Third Codon Positions Radically Influence the Estimation of Trees Based on Next-Gen Data

Cited by: 92
Authors
Breinholt, Jesse W. [1]
Kawahara, Akito Y. [1]
Affiliations
[1] Univ Florida, Florida Museum Nat Hist, Gainesville, FL 32611 USA
Source
GENOME BIOLOGY AND EVOLUTION | 2013 / Vol. 5 / No. 11
Funding
National Science Foundation (USA)
Keywords
Bombycoidea; Lepidoptera; phylogeny; saturation; synonymous substitutions; transcriptome; SITE RATE VARIATION; MAXIMUM-LIKELIHOOD MODELS; ENRICHMENT STRATEGIES; PHYLOGENETIC SIGNAL; SOFTWARE PACKAGE; ADVANCED MOTHS; MIXTURE MODEL; MISSING DATA; LEPIDOPTERA; SEQUENCE;
DOI
10.1093/gbe/evt157
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Recent advancements in molecular sequencing techniques have led to a surge in the number of phylogenetic studies that incorporate large amounts of genetic data. We test the assumption that analyzing a large number of genes will lead to improvements in tree resolution and branch support using moths in the superfamily Bombycoidea, a group with some interfamilial relationships that have been difficult to resolve. Specifically, we use a next-gen data set that included 19 taxa and 938 genes (~1.2M bp) to examine how codon position and saturation might influence resolution and node support among three key families. Maximum likelihood, parsimony, and species tree analysis using gene tree parsimony, on different nucleotide and amino acid data sets, resulted in largely congruent topologies with high bootstrap support compared with prior studies that included fewer loci. However, for a few shallow nodes, nucleotide and amino acid data provided high support for conflicting relationships. The third codon position was saturated, and phylogenetic analysis of this position alone supported a completely different, potentially misleading sister group relationship. We used the program RADICAL to assess the number of genes needed to fix some of these difficult nodes. One such node originally needed a total of 850 genes but only required 250 when synonymous signal was removed. Our study shows that, in order to effectively use next-gen data to correctly resolve difficult phylogenetic relationships, it is necessary to assess the effects of synonymous substitutions and third codon positions.
Pages: 2082-2092
Number of pages: 11