Some new invariant sum tests and MAD tests for the assessment of Benford's law

Cited by: 1
Authors
Koessler, Wolfgang [1]
Lenz, Hans-J. [2]
Wang, Xing D. [1]
Affiliations
[1] Humboldt Univ, Inst Informat, Rudower Chaussee 25, D-12489 Berlin, Germany
[2] Free Univ Berlin, Inst Stat & Okonometrie, Boltzmannstr 20, D-14195 Berlin, Germany
Keywords
Benford law; Goodness of fit test; Sum invariance; Data fraud; Data manipulation; Data quality;
DOI
10.1007/s00180-024-01463-8
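Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code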
020208 ; 070103 ; 0714 ;
Abstract
The Benford law is used world-wide for detecting non-conformance or data fraud of numerical data. It says that the significand of a data set from the universe is not uniformly, but logarithmically distributed. In particular, the first non-zero digit is 1 with a probability of approximately 0.3. Several tests are available for testing conformity with the Benford law; the best known are Pearson's $\chi^2$-test, the Kolmogorov-Smirnov test and a modified version of the MAD-test. In the present paper we propose some tests; three of the four invariant sum tests are new, and they are motivated by the sum invariance property of the Benford law. Two distance measures are investigated: the Euclidean and the Mahalanobis distance of the standardized sums to the origin. We use the significands corresponding to the first significant digit as well as to the second significant digit. Moreover, we suggest improved versions of the MAD-test and obtain critical values that are independent of the sample size. For illustration, the tests are applied to specifically selected data sets for which prior knowledge is available about whether or not they are Benford. Furthermore, we discuss the role of truncation of distributions.
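As a rough illustration of the classical first-digit tests named in the abstract, the sketch below computes the Benford first-digit probabilities log10(1 + 1/d), Pearson's chi-square statistic and a plain mean absolute deviation (MAD) of the observed digit frequencies. It is not the authors' method: the function names are hypothetical, the MAD shown is the unmodified textbook form rather than the improved versions proposed in the paper, and the invariant sum tests are not implemented.

```python
import math
from collections import Counter


def benford_first_digit_probs():
    """Benford probabilities P(D1 = d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """First significant (non-zero) digit of a non-zero number.

    Assumption: x is representable as a float. Formatting with 16 decimals
    avoids rounding a value such as 9.9999999 up to 1.0e+01.
    """
    return int(f"{abs(x):.16e}"[0])


def first_digit_stats(data):
    """Return (chi2, mad) for the first digits of the non-zero values in data.

    chi2 is Pearson's statistic with 8 degrees of freedom;
    mad is the mean absolute deviation between observed and Benford frequencies.
    """
    probs = benford_first_digit_probs()
    digits = [first_digit(x) for x in data if x != 0]
    n = len(digits)
    counts = Counter(digits)
    chi2 = sum((counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in probs.items())
    mad = sum(abs(counts.get(d, 0) / n - p) for d, p in probs.items()) / 9
    return chi2, mad


if __name__ == "__main__":
    # Powers of 2 are a standard example of a sequence that follows Benford's law.
    powers = [2 ** k for k in range(1000)]
    # The integers 1..999 have uniformly distributed first digits, so they do not.
    uniform = list(range(1, 1000))

    print("P(D1 = 1) =", round(benford_first_digit_probs()[1], 3))  # 0.301
    print("powers of 2:  chi2, MAD =", first_digit_stats(powers))
    print("1..999:       chi2, MAD =", first_digit_stats(uniform))
```

For orientation, the 5% critical value of the chi-square distribution with 8 degrees of freedom is about 15.507; the powers of 2 should fall well below it and the uniform-digit sample well above it.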
Pages: 3779 - 3800
Number of pages: 22
Related papers
35 in total
  • [1] An invariant-sum characterization of Benford's law
    Allaart, PC
    [J]. JOURNAL OF APPLIED PROBABILITY, 1997, 34 (01) : 288 - 291
  • [2] [Anonymous], 2016, UNSTATS REPORT
  • [3] [Anonymous], 2021, TAGESSPIEGEL SO WAR, P18
  • [4] RESIDUAL LIFE TIME AT GREAT AGE
    BALKEMA, AA
    DEHAAN, L
    [J]. ANNALS OF PROBABILITY, 1974, 2 (05) : 792 - 804
  • [5] On Characterizations and Tests of Benford's Law
    Barabesi, Lucio
    Cerasa, Andrea
    Cerioli, Andrea
    Perrotta, Domenico
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2022, 117 (540) : 1887 - 1903
  • [6] Benford F., 1938, P AM PHILOS SOC, V78, P551, DOI DOI 10.2307/984802
  • [7] Berger A, 2024, BENFORD ONLINE BIBLI
  • [8] A basic theory of Benford's Law
    Berger, Arno
    Hill, Theodore P.
    [J]. PROBABILITY SURVEYS, 2011, 8 : 1 - 126
  • [9] Compressing Yahoo Mail
    Bergman, Aran
    Zohar, Eyal
    [J]. 2015 DATA COMPRESSION CONFERENCE (DCC), 2015, : 223 - 232
  • [10] Some New Tests of Conformity with Benford's Law
    Cerqueti, Roy
    Lupi, Claudio
    [J]. STATS, 2021, 4 (03) : 745 - 761