Can Deliberately Incomplete Gene Sample Augmentation Improve a Phylogeny Estimate for the Advanced Moths and Butterflies (Hexapoda: Lepidoptera)?

Cited by: 81
Authors
Cho, Soowon [1]
Zwick, Andreas [2]
Regier, Jerome C. [2]
Mitter, Charles [1]
Cummings, Michael P. [3]
Yao, Jianxiu [2]
Du, Zaile [2]
Zhao, Hong [2]
Kawahara, Akito Y. [1]
Weller, Susan [4]
Davis, Donald R. [5]
Baixeras, Joaquin [6]
Brown, John W. [7]
Parr, Cynthia
Affiliations
[1] Univ Maryland, Dept Entomol, College Pk, MD 20742 USA
[2] Univ Maryland, Inst Biotechnol, Ctr Biosyst Res, College Pk, MD 20742 USA
[3] Univ Maryland, Lab Mol Evolut, Ctr Bioinformat & Computat Biol, College Pk, MD 20742 USA
[4] Univ Minnesota, Dept Entomol, St Paul, MN 55108 USA
[5] Smithsonian Inst, Dept Entomol, Washington, DC 20560 USA
[6] Univ Valencia, Cavanilles Inst Biodivers & Evolutionary Biol, Valencia, Spain
[7] ARS, Systemat Entomol Lab, USDA, Beltsville, MD 20705 USA
Funding
National Science Foundation (USA);
Keywords
Ditrysia; gene sampling; Hexapoda; Lepidoptera; missing data; molecular phylogenetics; nuclear genes; taxon sampling; CODON-SUBSTITUTION MODELS; MISSING DATA; NUCLEAR GENE; AMINO-ACID; DATA SETS; TAXA; EVOLUTION; TREE; CHARACTERS; CONFIDENCE;
DOI
10.1093/sysbio/syr079
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
This paper addresses the question of whether one can economically improve the robustness of a molecular phylogeny estimate by increasing gene sampling in only a subset of taxa, without having the analysis invalidated by artifacts arising from large blocks of missing data. Our case study stems from an ongoing effort to resolve poorly understood deeper relationships in the large clade Ditrysia (>150,000 species) of the insect order Lepidoptera (butterflies and moths). Seeking to remedy the overall weak support for deeper divergences in an initial study based on five nuclear genes (6.6 kb) in 123 exemplars, we nearly tripled the total gene sample (to 26 genes, 18.4 kb) but only in a third (41) of the taxa. The resulting partially augmented data matrix (45% intentionally missing data) consistently increased bootstrap support for groupings previously identified in the five-gene (nearly) complete matrix, while introducing no contradictory groupings of the kind that missing data have been predicted to produce. Our results add to growing evidence that data sets differing substantially in gene and taxon sampling can often be safely and profitably combined. The strongest overall support for nodes above the family level came from including all nucleotide changes, while partitioning sites into sets undergoing mostly nonsynonymous versus mostly synonymous change. In contrast, support for the deepest node for which any persuasive molecular evidence has yet emerged (78-85% bootstrap) was weak or nonexistent unless synonymous change was entirely excluded, a result plausibly attributed to compositional heterogeneity. This node (Gelechioidea + Apoditrysia), tentatively proposed by previous authors on the basis of four morphological synapomorphies, is the first major subset of ditrysian superfamilies to receive strong statistical support in any phylogenetic study.
A "more-genes-only" data set (41 taxa × 26 genes) also gave strong signal for a second deep grouping (Macrolepidoptera) that was obscured, but not strongly contradicted, in more taxon-rich analyses.
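As a rough consistency check on the matrix dimensions quoted in the abstract, the sketch below estimates how much of the combined matrix is empty purely because of the sampling design. It assumes (this is not stated in the record) that every sampled gene is fully sequenced in every taxon it was sampled for, so the only empty cells are the added genes in the taxa that were not augmented. The function and constant names are illustrative.

```python
# Back-of-envelope check of the matrix dimensions quoted in the abstract.
# Assumption (not from the record): every sampled gene is complete in every
# taxon it was sampled for, so the only empty cells are the added genes in
# the taxa that were NOT augmented.

TOTAL_TAXA = 123      # exemplars in the initial five-gene study
AUGMENTED_TAXA = 41   # taxa sequenced for all 26 genes
BASE_KB = 6.6         # aligned length of the five-gene data set
FULL_KB = 18.4        # aligned length of the 26-gene data set


def structural_missing_fraction(total_taxa, augmented_taxa, base_kb, full_kb):
    """Fraction of matrix cells left empty by the sampling design alone."""
    missing = (total_taxa - augmented_taxa) * (full_kb - base_kb)
    return missing / (total_taxa * full_kb)


fraction = structural_missing_fraction(
    TOTAL_TAXA, AUGMENTED_TAXA, BASE_KB, FULL_KB
)
print(f"Structurally missing: {fraction:.1%}")
```

Under these assumptions the design alone leaves roughly 43% of the matrix empty, close to the 45% reported. The small remainder presumably comes from incomplete sequences within sampled genes, which this sketch does not model.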
Pages: 782-796
Number of pages: 15