Bayesian-multiplicative treatment of count zeros in compositional data sets

Times cited: 179
Authors
Martin-Fernandez, Josep-Antoni [1]
Hron, Karel [2, 3]
Templ, Matthias [3, 4]
Filzmoser, Peter [3, 4]
Palarea-Albaladejo, Javier [5]
Affiliations
[1] Univ Girona, Dept Comp Sci Appl Math & Stat, E-17071 Girona, Spain
[2] Palacky Univ, Fac Sci, Dept Math Anal & Applicat Math, Olomouc, Czech Republic
[3] Palacky Univ, Fac Sci, Dept Geoinformat, Olomouc, Czech Republic
[4] Vienna Univ Technol, Dept Stat & Probabil Theory, Vienna, Austria
[5] Biomath & Stat Scotland, Edinburgh, Midlothian, Scotland
Keywords
Dirichlet distribution; discrete composition; log-ratio transformations; posterior estimate; zero replacement
DOI
10.1177/1471082X14535524
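Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract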
Compositional count data are discrete vectors representing the numbers of outcomes falling into any of several mutually exclusive categories. Compositional techniques based on the log-ratio methodology are appropriate in those cases where the total sum of the vector elements is not of interest. Such compositional count data sets can contain zero values which are often the result of insufficiently large samples. That is, they refer to unobserved positive values that may have been observed with a larger number of trials or with a different sampling design. Because the log-ratio transformations require data with positive values, any statistical analysis of count compositions must be preceded by a proper replacement of the zeros. A Bayesian-multiplicative treatment has been proposed for addressing this count zero problem in several case studies. This treatment involves the Dirichlet prior distribution as the conjugate distribution of the multinomial distribution and a multiplicative modification of the non-zero values. Different parameterizations of the prior distribution provide different zero replacement results, whose coherence with the vector space structure of the simplex is stated. Their performance is evaluated from both the theoretical and the computational point of view.
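The replacement described in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly used form of the method: with a Dirichlet prior of strength `s` and prior expectation `t_j` for part `j`, a zero count in a vector with total `n` is replaced by the posterior estimate `s * t_j / (n + s)`, and each non-zero proportion is multiplied by one minus the sum of the imputed values, so that the result still sums to 1 and the ratios between non-zero parts are unchanged. The function name, the parameter names `strength` and `prior`, and the default choice (`s = 1`, uniform `t_j = 1/D`) are illustrative assumptions, not taken from the record above.

```python
import numpy as np


def bayes_mult_replace(counts, strength=1.0, prior=None):
    """Sketch of a Bayesian-multiplicative zero replacement for one count vector.

    counts   : non-negative counts over D mutually exclusive categories
    strength : prior strength s (assumed default: 1)
    prior    : prior expectations t_j summing to 1 (assumed default: uniform 1/D)
    Returns a strictly positive composition that sums to 1.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    if n <= 0:
        raise ValueError("the count vector must have a positive total")
    d = counts.size
    if prior is None:
        prior = np.full(d, 1.0 / d)
    prior = np.asarray(prior, dtype=float)

    zeros = counts == 0
    # Posterior estimate of the probability of a part whose count is zero,
    # from the Dirichlet(s * t) prior combined with the multinomial counts.
    imputed = strength * prior / (n + strength)
    # Multiplicative adjustment: shrink the observed proportions by the total
    # mass given to the zero parts, which preserves their mutual ratios.
    proportions = counts / n
    adjusted = proportions * (1.0 - imputed[zeros].sum())
    return np.where(zeros, imputed, adjusted)


if __name__ == "__main__":
    print(bayes_mult_replace([0, 3, 7]))
```

Changing `strength` and `prior` gives the different parameterizations of the prior that the abstract refers to; the multiplicative step is what keeps the ratios among the observed parts intact, which matters for any subsequent log-ratio analysis.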
Pages: 134-158
Page count: 25
Related papers
10 items in total
[1]   Compositional data analysis: Where are we and where should we be heading? [J].
Aitchison, J ;
Egozcue, JJ .
MATHEMATICAL GEOLOGY, 2005, 37 (07) :829-850
[2]   Biplots of compositional data [J].
Aitchison, J ;
Greenacre, M .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2002, 51 :375-392
[3]  
Anderson T.W., 1986, STAT ANAL DATA, V2nd, DOI 10.1007/978-94-009-4109-0
[4]  
Croome R, 2011, TECHNICAL REPORT
[5]   Groups of parts and their balances in compositional data analysis [J].
Egozcue, JJ ;
Pawlowsky-Glahn, V .
MATHEMATICAL GEOLOGY, 2005, 37 (07) :795-828
[6]   Isometric logratio transformations for compositional data analysis [J].
Egozcue, JJ ;
Pawlowsky-Glahn, V ;
Mateu-Figueras, G ;
Barceló-Vidal, C .
MATHEMATICAL GEOLOGY, 2003, 35 (03) :279-300
[7]  
Frechet Maurice, 1948, Ann. Inst. Henri Poincare, V10, P215
[8]  
Egozcue JJ, 2011, COMPOSITIONAL DATA ANALYSIS: THEORY AND APPLICATIONS, P141
[9]  
Lovell D, 2011, COMPOSITIONAL DATA ANALYSIS: THEORY AND APPLICATIONS, P193
[10]   Geometric approach to statistical analysis on the simplex [J].
Pawlowsky-Glahn, V ;
Egozcue, JJ .
STOCHASTIC ENVIRONMENTAL RESEARCH AND RISK ASSESSMENT, 2001, 15 (05) :384-398