Testing noisy numerical data for monotonic association

Cited by: 17
Authors
Bodenhofer, Ulrich [1]
Krone, Martin [2]
Klawonn, Frank [2, 3]
Affiliations
[1] Johannes Kepler Univ Linz, Inst Bioinformat, A-4040 Linz, Austria
[2] Ostfalia Univ Appl Sci, Dept Comp Sci, D-38302 Wolfenbuttel, Germany
[3] Helmholtz Ctr Infect Res, D-38124 Braunschweig, Germany
Keywords
Gamma correlation coefficient; Rank correlation; Rank correlation test; Fuzzy ordering; Robust statistics; R package rococo; FUZZY ORDERINGS; PROOF;
DOI
10.1016/j.ins.2012.11.026
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Rank correlation measures are intended to measure the extent to which there is a monotonic association between two observables. Because they are mainly designed for ordinal data, they are not ideally suited for noisy numerical data. To better account for noisy data, a family of rank correlation measures has previously been introduced that replaces classical ordering relations with fuzzy relations with smooth transitions, thereby ensuring that the correlation measure is continuous with respect to the data. This paper briefly reviews the basic concepts behind this family of rank correlation measures and investigates it from the viewpoint of robust statistics. On this basis, we then introduce a framework of novel rank correlation tests. An extensive experimental evaluation using a large number of simulated data sets demonstrates that the new tests indeed outperform the classical variants in terms of type II error rates without sacrificing good performance in terms of type I error rates. This is mainly because the new tests are more robust to noise for small samples. The Gaussian rank correlation estimator turned out to be the best choice in situations where no prior knowledge about the data is available, whereas the new family of robust gamma tests provides an advantage in situations where information about the noise distribution is available. An implementation of all robust rank correlation tests used in this paper is available as an R package from the CRAN repository. © 2012 Elsevier Inc. All rights reserved.
Pages: 21-37
Number of pages: 17
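The idea described in the abstract, replacing the crisp ordering relation with a fuzzy one that has a smooth transition, can be illustrated with a minimal Python sketch. This is not the API of the rococo R package and not necessarily the exact definition used in the paper: the function names, the linear transition of width r, the use of the minimum to combine degrees, and the generic two-sided permutation test are all assumptions chosen for illustration.

```python
import random


def strict_fuzzy_less(a, b, r):
    """Degree in [0, 1] to which a is clearly smaller than b.

    Assumption: a linear transition of width r. With r <= 0 this
    falls back to the crisp relation a < b.
    """
    if r <= 0:
        return 1.0 if a < b else 0.0
    return min(1.0, max(0.0, (b - a) / r))


def robust_gamma(x, y, rx, ry):
    """Gamma-type coefficient built from fuzzy concordance and discordance.

    Assumption: degrees are combined with the minimum. For rx = ry = 0
    this reduces to the classical Goodman-Kruskal gamma.
    """
    conc = disc = 0.0
    n = len(x)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            lx = strict_fuzzy_less(x[i], x[j], rx)
            # Pair is concordant to the degree that y agrees with x ...
            conc += min(lx, strict_fuzzy_less(y[i], y[j], ry))
            # ... and discordant to the degree that y is reversed.
            disc += min(lx, strict_fuzzy_less(y[j], y[i], ry))
    total = conc + disc
    return 0.0 if total == 0 else (conc - disc) / total


def permutation_p_value(x, y, rx, ry, n_perm=500, seed=0):
    """Two-sided permutation test: shuffle y and compare |gamma| values."""
    rng = random.Random(seed)
    observed = robust_gamma(x, y, rx, ry)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(robust_gamma(x, y_perm, rx, ry)) >= abs(observed):
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    print(permutation_p_value(xs, ys, rx=0.5, ry=0.5))
```

In this sketch, differences much smaller than the transition widths rx and ry contribute only small concordance or discordance degrees, which is what makes the coefficient vary continuously with the data. Choosing the widths relative to the expected noise level is left to the user here.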