Finite Mixture of Regression Modeling for High-Dimensional Count and Biomass Data in Ecology

Cited by: 53
Authors
Dunstan, Piers K. [1 ]
Foster, Scott D. [2 ]
Hui, Francis K. C. [3 ]
Warton, David I. [3 ]
Affiliations
[1] CSIRO Marine & Atmospher Res, Hobart, Tas 7001, Australia
[2] CSIRO Math Informat & Stat, Hobart, Tas 7001, Australia
[3] Univ New S Wales, Sch Math & Stat, Sydney, NSW 2052, Australia
Funding
Australian Research Council;
Keywords
Community-level model; Mixture model; Multi-species; Species archetype model; Species distribution model; Tweedie; MAXIMUM-LIKELIHOOD; BETA DIVERSITY;
DOI
10.1007/s13253-013-0146-x
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Understanding how species distributions respond as a function of environmental gradients is a key question in ecology, and will benefit from a multi-species approach. Multi-species data are often high dimensional, in that the number of species sampled is often large relative to the number of sites, and are commonly quantified as either presence-absence, counts of individuals, or biomass of each species. In this paper, we propose a novel approach to the analysis of multi-species data when the goal is to understand how each species responds to its environment. We use a finite mixture of regression models, grouping species into "Archetypes" according to their environmental response, thereby significantly reducing the dimension of the regression model. Previous research introduced such Species Archetype Models (SAMs), but only for binary assemblage data. Here, we extend this basic framework with three key innovations: (1) the method is expanded to handle count and biomass data, (2) we propose grouping on the slope coefficients only, whilst the intercept terms and nuisance parameters remain species-specific, and (3) we develop model diagnostic tools for SAMs. By grouping on environmental responses only, the model allows for inter-species variation in terms of overall prevalence and abundance. The application of our expanded SAM framework is illustrated on marine survey data and through simulation. Supplementary materials accompanying this paper appear on-line.
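The grouping idea in the abstract (archetype-shared slopes, species-specific intercepts) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Poisson count model with a log link, which is only one of the data types the abstract mentions, and every name in it (sam_poisson_loglik, alpha, beta, pi, tau) is hypothetical. Under these assumptions, species j at site i in archetype k has mean exp(alpha[j] + X[i] . beta[k]), and each species contributes the log of a pi-weighted sum of its per-archetype likelihoods.

```python
import math


def poisson_logpmf(y, log_mu):
    # Log of the Poisson probability mass function, parameterised by log(mean).
    return y * log_mu - math.exp(log_mu) - math.lgamma(y + 1)


def sam_poisson_loglik(Y, X, alpha, beta, pi):
    """Mixture log-likelihood for an illustrative Poisson archetype model.

    Y     : n_sites x n_species nested list of counts
    X     : n_sites x p nested list of environmental covariates
    alpha : species-specific intercepts (length n_species)
    beta  : K x p archetype slope vectors, shared by all species in an archetype
    pi    : K mixing proportions that sum to 1
    Returns (log-likelihood, tau), where tau[j][k] is the posterior
    probability that species j belongs to archetype k.
    """
    n_sites, n_species, n_arch = len(Y), len(Y[0]), len(pi)
    total, tau = 0.0, []
    for j in range(n_species):
        comp = []
        for k in range(n_arch):
            s = math.log(pi[k])
            for i in range(n_sites):
                # Assumed linear predictor: own intercept plus archetype slopes.
                log_mu = alpha[j] + sum(x * b for x, b in zip(X[i], beta[k]))
                s += poisson_logpmf(Y[i][j], log_mu)
            comp.append(s)
        # Log-sum-exp over archetypes for numerical stability.
        m = max(comp)
        log_mix = m + math.log(sum(math.exp(c - m) for c in comp))
        total += log_mix
        tau.append([math.exp(c - log_mix) for c in comp])
    return total, tau


if __name__ == "__main__":
    # Toy data (made up for illustration): 3 sites, 2 species, 1 covariate.
    Y = [[0, 5], [1, 9], [4, 20]]
    X = [[-1.0], [0.0], [1.0]]
    loglik, tau = sam_poisson_loglik(Y, X, [0.0, 2.0], [[1.0], [-1.0]], [0.5, 0.5])
    print(loglik, tau)
```

In the toy data both species increase along the covariate but differ greatly in abundance. Because the intercepts are species-specific, both can still be assigned to the same increasing-slope archetype, which is the point the abstract makes about grouping on environmental responses only.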
Pages: 357-375
Page count: 19