Reliable Phylogenetic Regressions for Multivariate Comparative Data: Illustration with the MANOVA and Application to the Effect of Diet on Mandible Morphology in Phyllostomid Bats

Cited by: 51
Authors
Clavel, Julien [1, 2, 3]
Morlon, Helene [1]
Affiliations
[1] Paris Sci & Lettres PSL Res Univ, Ecole Normale Super IBENS, INSERM U1024, Ecole Normale Super,Inst Biol,CNRS,UMR 8197, 46 Rue Ulm, F-75005 Paris, France
[2] Nat Hist Museum, Life Sci Dept, Cromwell Rd, London SW7 5BD, England
[3] Univ Claude Bernard Lyon 1, Univ Lyon, Lab Ecol Hydrosyst Nat & Anthropises, UMR CNRS 5023,ENTPE, Blvd 11 Novembre 1918, F-69622 Villeurbanne, France
Funding
European Research Council
Keywords
Generalized least squares; high-dimensional data sets; multivariate phylogenetic comparative methods; penalized likelihood; phenomics; phyllostomid bats; phylogenetic MANOVA; phylogenetic regression; R package; cross-validation; principal components; regularized MANOVA; covariance; likelihood; models; classification; information; contrasts
DOI
10.1093/sysbio/syaa010
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Understanding what shapes species phenotypes over macroevolutionary timescales from comparative data often requires studying the relationship between phenotypes and putative explanatory factors or testing for differences in phenotypes across species groups. In phyllostomid bats, for example, is mandible morphology associated with diet preferences? Performing such analyses depends upon reliable phylogenetic regression techniques and associated tests (e.g., phylogenetic Generalized Least Squares, pGLS, and phylogenetic analyses of variance and covariance, pANOVA, pANCOVA). While these tools are well established for univariate data, their multivariate counterparts are lagging behind. This is particularly true for high-dimensional phenotypic data, such as morphometric data. Here, we implement much-needed likelihood-based multivariate pGLS, pMANOVA, and pMANCOVA, and use a recently developed penalized-likelihood framework to extend their application to the difficult case when the number of traits p approaches or exceeds the number of species n. We then focus on the pMANOVA and use intensive simulations to assess the performance of the approach as p increases, under various levels of phylogenetic signal and correlations between the traits, phylogenetic structure in the predictors, and under various types of phenotypic differences across species groups. We show that our approach outperforms available alternatives under all circumstances, with greater power to detect phenotypic differences across species groups when they exist, and a lower risk of improperly detecting nonexistent differences. Finally, we provide an empirical illustration of our pMANOVA on a geometric-morphometric data set describing mandible morphology in phyllostomid bats along with data on their diet preferences. Overall, our results show significant differences between ecological groups.
Our approach, implemented in the R package mvMORPH and illustrated in a tutorial for end-users, provides efficient multivariate phylogenetic regression tools for understanding what shapes phenotypic differences across species.
Pages: 927-943
Number of pages: 17