Rate Distortion Theory for Descriptive Statistics

Cited by: 3
Author
Harremoes, Peter [1, 2]
Affiliations
[1] Copenhagen Business Coll, Niels Brock, GSK Dept, Norre Voldgade 34, DK-1358 Copenhagen K, Denmark
[2] Ronne Alle 1 St, DK-2860 Soborg, Denmark
Keywords
rate distortion theory; quantizer; descriptive statistics; clustering; Gaussian mixture models; outlier detection; linear regression; Anscombe quartet; qibla; early Islam; INFORMATION; CONVERGENCE; ENTROPY; LAW;
DOI
10.3390/e25030456
Chinese Library Classification
O4 [Physics];
Discipline classification code
0702;
Abstract
Rate distortion theory was developed for optimizing lossy compression of data, but it also has applications in statistics. In this paper, we illustrate how rate distortion theory can be used to analyze various datasets. The analysis involves testing, identification of outliers, choice of compression rate, calculation of optimal reconstruction points, and assigning "descriptive confidence regions" to the reconstruction points. We study four models or datasets of increasing complexity: clustering, Gaussian models, linear regression, and a dataset describing orientations of early Islamic mosques. These examples illustrate how rate distortion analysis may serve as a common framework for handling different statistical problems.
Pages: 24