Univariate statistical analysis of environmental (compositional) data: Problems and possibilities

Cited by: 325
Authors
Filzmoser, Peter [1]
Hron, Karel [2]
Reimann, Clemens [3]
Affiliations
[1] Vienna Univ Technol, Inst Stat & Probabil Theory, A-1040 Vienna, Austria
[2] Palacky Univ, Fac Sci, Dept Math Anal & Applicat Math, CZ-77100 Olomouc, Czech Republic
[3] Geol Survey Norway, N-7491 Trondheim, Norway
Keywords
Compositional data; Closure problem; Univariate statistical analysis; Exploratory data analysis; Log transformation; TRANSFORMATIONS;
DOI
10.1016/j.scitotenv.2009.08.008
Chinese Library Classification (CLC) number
X [Environmental science, safety science];
Discipline classification code
08 ; 0830 ;
Abstract
For almost 30 years it has been known that compositional (closed) data have special geometrical properties. In environmental sciences, where the concentration of chemical elements in different sample materials is investigated, almost all datasets are compositional. In general, compositional data are parts of a whole which only give relative information. Data that sum up to a constant, e.g. 100 wt.% or 1,000,000 mg/kg, are the best-known example. It is widely neglected that the "closure" characteristic remains even if only one of all possible elements is measured; it is an inherent property of compositional data. No variable is free to vary independently of all the others. Existing transformations to "open" closed data are seldom applied. They are more complicated than a log transformation and the relationship to the original data unit is lost. Results obtained when using classical statistical techniques for data analysis appeared reasonable and the possible consequences of working with closed data were rarely questioned. Here the simple univariate case of data analysis is investigated. It can be demonstrated that data closure must be overcome prior to calculating even simple statistical measures like mean or standard deviation or plotting graphs of the data distribution, e.g. a histogram. Some measures like the standard deviation (or the variance) make no statistical sense with closed data and all statistical tests building on the standard deviation (or variance) will thus provide erroneous results if used with the original data. (C) 2009 Elsevier B.V. All rights reserved.
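A minimal sketch of the univariate point made in the abstract, under stated assumptions: the abstract does not name the "opening" transformation, so the logit-type map ln(x / (1 - x)) / sqrt(2) used here (one part against the remainder of the whole) is an illustrative choice, and the concentration values and function names are hypothetical. The sketch compares the standard deviation of a part and of its remainder after a plain log transformation and after the logit-type transformation.

```python
import math
import statistics


def logit_ilr(p):
    """Map a proportion p in (0, 1) to the real line: ln(p / (1 - p)) / sqrt(2)."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p)) / math.sqrt(2.0)


def inverse_logit_ilr(z):
    """Back-transform a value z on the real line to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-math.sqrt(2.0) * z))


# Hypothetical concentrations in mg/kg (made-up values, not from the paper).
mg_per_kg = [12000.0, 45000.0, 150000.0, 320000.0, 610000.0]
proportions = [v / 1_000_000 for v in mg_per_kg]
remainders = [1.0 - p for p in proportions]

# Plain log transformation: the spread of the part and of its remainder differ.
sd_log_part = statistics.stdev(math.log(p) for p in proportions)
sd_log_rest = statistics.stdev(math.log(r) for r in remainders)

# Logit-type transformation: part and remainder are mirror images,
# so their standard deviations agree.
z_part = [logit_ilr(p) for p in proportions]
z_rest = [logit_ilr(r) for r in remainders]
sd_part = statistics.stdev(z_part)
sd_rest = statistics.stdev(z_rest)

# A central value computed in transformed space, expressed again in mg/kg.
center_mg_per_kg = inverse_logit_ilr(statistics.mean(z_part)) * 1_000_000

print("sd after log:  ", sd_log_part, sd_log_rest)
print("sd after logit:", sd_part, sd_rest)
print("back-transformed center (mg/kg):", center_mg_per_kg)
```

With the plain log transformation the two standard deviations disagree, although the part and its remainder carry the same relative information; with the logit-type map they coincide. The back-transformed center is one way to report a result in the original unit, at the cost that the spread itself stays in transformed units.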
Pages: 6100 - 6108
Number of pages: 9
Related papers
50 items in total
  • [11] Statistical Analysis and Interpolation of Compositional Data in Materials Science
    Pesenson, Misha Z.
    Suram, Santosh K.
    Gregoire, John M.
    ACS COMBINATORIAL SCIENCE, 2015, 17 (02) : 130 - 136
  • [12] PROBLEMS AND POSSIBILITIES OF STATISTICAL DOCUMENTATION
    FERSCHL, F
    METRIKA, 1969, 14 (2-3) : 277 - 292
  • [13] Evaluation of Metabolomics Data Using Univariate and Multivariate Statistical Analysis Techniques
    Moroz, J.
    Fallone, G.
    Syme, A.
    Allalunis-Turner, J.
    MEDICAL PHYSICS, 2010, 37 (06) : 3471 - +
  • [14] Statistical Analysis of COVID-19 Data: Using A New Univariate and Bivariate Statistical Model
    Bantan, Rashad A. R.
    Shafiq, Shakaiba
    Tahir, M. H.
    Elhassanein, Ahmed
    Jamal, Farrukh
    Almutiry, Waleed
    Elgarhy, Mohammed
    JOURNAL OF FUNCTION SPACES, 2022, 2022
  • [15] Compositional data analysis in geo-environmental sciences
    Egozcue, J. J.
    Pawlowsky-Glahn, V.
    BOLETIN GEOLOGICO Y MINERO, 2011, 122 (04) : 439 - 452
  • [16] Multivariate statistical analysis in problems of environmental simulation
    Serdiutskaya, L.F.
    Kameneva, I.P.
    Engineering Simulation, 2000, 17 (02) : 193 - 204
  • [17] Multivariate statistical analysis of environmental data
    Brzezinska, Justyna
    Rybicka, Aneta
    Pelka, Marcin
    12TH PROFESSOR ALEKSANDER ZELIAS INTERNATIONAL CONFERENCE ON MODELLING AND FORECASTING OF SOCIO-ECONOMIC PHENOMENA, 2018, 1 : 40 - 49
  • [18] DATA PROBLEMS AND STATISTICAL-ANALYSIS
    GUTTMAN, NB
    11TH CONFERENCE ON PROBABILITY AND STATISTICS IN ATMOSPHERIC SCIENCES, 1989, : 1 - 5
  • [19] Compositional data analysis enables statistical rigor in comparative glycomics
    Bennett, Alexander R.
    Lundstrom, Jon
    Chatterjee, Sayantani
    Thaysen-Andersen, Morten
    Bojar, Daniel
    NATURE COMMUNICATIONS, 2025, 16 (01)
  • [20] STATISTICAL-ANALYSIS OF CHEMICAL COMPOSITIONAL DATA AND THE COMPARISON OF ANALYSES
    BAXTER, MJ
    ARCHAEOMETRY, 1992, 34 : 267 - 277