A computationally efficient approach to estimating species richness and rarefaction curve

Cited by: 4
Authors
Baek, Seungchul [1]
Park, Junyong [2]
Affiliations
[1] Univ Maryland Baltimore Cty, Dept Math & Stat, Baltimore, MD USA
[2] Seoul Natl Univ, Dept Stat, Seoul, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Nonparametric empirical Bayes; Quadratic optimization; Rarefaction curve; Species richness; MAXIMUM-LIKELIHOOD-ESTIMATION; ACCUMULATION CURVE; NONPARAMETRIC MLE; NUMBER; DIVERSITY; MODELS; REGRESSION; SAMPLE;
DOI
10.1007/s00180-021-01185-1
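Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes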
020208 ; 070103 ; 0714 ;
Abstract
In ecological and educational studies, estimators of the total number of species and of the rarefaction curve based on empirical samples are important tools. We propose a new method that estimates both the rarefaction curve and the number of species using a ready-made numerical approach, namely quadratic optimization. The key idea behind the proposed algorithm is nonparametric empirical Bayes estimation that incorporates an interpolated rarefaction curve through quadratic optimization with linear constraints, based on the g-modeling of Efron (Stat Sci 29:285-301, 2014). The proposed algorithm is easy to implement and outperforms existing methods in computational speed and accuracy. Furthermore, we provide a model selection criterion for choosing tuning parameters in the estimation procedure, and a confidence interval based on asymptotic theory rather than on resampling. We present asymptotic results for our estimator to support its efficiency theoretically. A broad range of numerical studies, including simulations and real data examples, is also conducted, and the resulting gains are compared with those of existing methods.
Pages: 1919-1941
Page count: 23
Related papers
40 in total
[1]   Fine-scale phylogenetic architecture of a complex bacterial community [J].
Acinas, SG ;
Klepac-Ceraj, V ;
Hunt, DE ;
Pharino, C ;
Ceraj, I ;
Distel, DL ;
Polz, MF .
NATURE, 2004, 430 (6999) :551-554
[2]  
[Anonymous], 1968, A Complete and Systematic Concordance to the Works of Shakespeare
[3]  
Böhning D., 1999, COMPUTER ASSISTED AN
[4]  
Baayen R.H., 2002, WORD FREQUENCY DISTR
[5]   Objective Bayesian Estimation for the Number of Species [J].
Barger, Kathryn ;
Bunge, John .
BAYESIAN ANALYSIS, 2010, 5 (04) :765-785
[6]   Concentration inequalities in the infinite urn scheme for occupancy counts and the missing mass, with applications [J].
Ben-Hamou, Anna ;
Boucheron, Stephane ;
Ohannessian, Mesrob I. .
BERNOULLI, 2017, 23 (01) :249-287
[8]   A NOTE ON A TEST FOR POISSON OVERDISPERSION [J].
BOHNING, D .
BIOMETRIKA, 1994, 81 (02) :418-419
[9]   Nonparametric maximum likelihood estimation of population size based on the counting distribution [J].
Böhning, D ;
Schön, D .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2005, 54 :721-737
[10]   ESTIMATING THE NUMBER OF SPECIES - A REVIEW [J].
BUNGE, J ;
FITZPATRICK, M .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1993, 88 (421) :364-373