Aitchison's Compositional Data Analysis 40 Years on: A Reappraisal

Cited by: 17
Authors
Greenacre, Michael [1, 2]
Grunsky, Eric [3]
Bacon-Shone, John [4]
Erb, Ionas [5]
Quinn, Thomas [6]
Affiliations
[1] Univ Pompeu Fabra, Dept Econ & Business, Barcelona, Spain
[2] Barcelona Sch Management, Barcelona, Spain
[3] Univ Waterloo, Dept Earth & Environm Sci, Waterloo, ON, Canada
[4] Univ Hong Kong, Fac Social Sci, Hong Kong, Peoples R China
[5] Barcelona Inst Sci & Technol, Ctr Genom Regulat CRG, Barcelona, Spain
[6] Deakin Univ, Appl Artificial Intelligence Inst A2I2, Geelong, Australia
Keywords
Box-Cox transformation; compositional modeling; correspondence analysis; isometry; logratio transformations; log-contrast; principal component analysis; Procrustes analysis; subcompositional coherence; BIOLOGICAL-ACTIVITY PROFILES; STATISTICAL-ANALYSIS; FATTY-ACIDS; SELECTION; VISUALIZATION; REGRESSION; MODELS; PARTS
DOI
10.1214/22-STS880
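Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract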
The development of John Aitchison's approach to compositional data analysis is followed since his paper read to the Royal Statistical Society in 1982. Aitchison's logratio approach, which was proposed to solve the problematic aspects of working with data with a fixed-sum constraint, is summarized and reappraised. It is maintained that the properties on which this approach was originally built, the main one being subcompositional coherence, are not required to be satisfied exactly: quasi-coherence is sufficient, that is, near enough to being coherent for all practical purposes. This opens up the field to using simpler data transformations, such as power transformations, that permit zero values in the data. The additional property of exact isometry, which was subsequently introduced and not in Aitchison's original conception, imposed the use of isometric logratio transformations, but these are complicated and problematic to interpret, involving ratios of geometric means. If this property is regarded as important in certain analytical contexts, for example, unsupervised learning, it can be relaxed by showing that regular pairwise logratios, as well as the alternative quasi-coherent transformations, can also be quasi-isometric, meaning they are close enough to exact isometry for all practical purposes. It is concluded that the isometric and related logratio transformations such as pivot logratios are not a prerequisite for good practice, although many authors insist on their obligatory use. This conclusion is fully supported here by case studies in geochemistry and in genomics, where the good performance is demonstrated of pairwise logratios, as originally proposed by Aitchison, or Box-Cox power transforms of the original compositions where no zero replacements are necessary.
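The distinction the abstract draws between exact and approximate subcompositional coherence can be illustrated with a small numerical sketch. This is not code from the paper. It assumes the textbook Box-Cox form (x^λ - 1)/λ, uses made-up three-part data, and all function names are hypothetical. It also does not reproduce whatever coherence or isometry measures the authors use.

```python
import math


def close(parts):
    """Rescale nonnegative parts so that they sum to 1 (closure)."""
    total = sum(parts)
    return [p / total for p in parts]


def pairwise_logratio(x, i, j):
    """Log of the ratio between parts i and j; needs strictly positive parts."""
    return math.log(x[i] / x[j])


def box_cox(value, lam):
    """Textbook Box-Cox power transform (assumed form); defined at value == 0 for lam > 0."""
    return (value ** lam - 1.0) / lam


def box_cox_difference(x, i, j, lam):
    """Difference of Box-Cox transformed parts; tends to the logratio as lam -> 0."""
    return box_cox(x[i], lam) - box_cox(x[j], lam)


# Illustrative data only: a three-part composition and its two-part subcomposition.
full = close([10.0, 30.0, 60.0])
sub = close(full[:2])

# Exact coherence: the pairwise logratio is the same before and after re-closure.
lr_full = pairwise_logratio(full, 0, 1)
lr_sub = pairwise_logratio(sub, 0, 1)

# Approximate coherence: with a small power the Box-Cox difference changes only slightly.
lam = 0.01
bc_full = box_cox_difference(full, 0, 1, lam)
bc_sub = box_cox_difference(sub, 0, 1, lam)

# A zero part is transformable without replacement when lam > 0.
bc_zero = box_cox(0.0, 0.5)

print("logratio, full vs sub:", lr_full, lr_sub)
print("Box-Cox difference, full vs sub:", bc_full, bc_sub)
print("Box-Cox of a zero part:", bc_zero)
```

The logratio is identical in the full composition and the subcomposition, while the power-transformed difference shifts slightly and stays close to the logratio. That small shift is the kind of near-coherence the abstract refers to, shown here under the stated assumptions.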
Pages: 386-410
Number of pages: 25