Bump hunting by topological data analysis

Cited by: 5
Authors
Sommerfeld, Max [1]
Heo, Giseon [2]
Kim, Peter [3]
Rush, Stephen T. [4]
Marron, J. S. [5]
Affiliations
[1] Univ Gottingen, Felix Bernstein Inst Math Stat Biosci, D-37077 Gottingen, Germany
[2] Univ Alberta, Sch Dent, Edmonton, AB T6G 2R7, Canada
[3] Univ Guelph, Dept Math & Stat, Guelph, ON N1G 2W1, Canada
[4] Orebro Univ, Sch Med Sci, SE-70182 Orebro, Sweden
[5] Univ N Carolina, Dept Stat, Chapel Hill, NC 27599 USA
Funding
Natural Sciences and Engineering Research Council of Canada; National Science Foundation (USA)
Keywords
bootstrap; kernel density estimation; mode hunting; persistent homology; SiZer; bandwidth selection; scale-space; mixtures
DOI
10.1002/sta4.167
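Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract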
A topological data analysis approach is taken to the challenging problem of finding and validating the statistical significance of local modes in a data set. As with the SIgnificance of the ZERo (SiZer) approach to this problem, statistical inference is performed in a multi-scale way, that is, across bandwidths. The key contribution is a two-parameter approach to the persistent homology representation. For each kernel bandwidth, a sub-level set filtration of the resulting kernel density estimate is computed. Inference based on the resulting persistence diagram indicates statistical significance of modes. It is seen through a simulated example, and by analysis of the famous Hidalgo stamps data, that the new method has more statistical power for finding bumps than SiZer. Copyright (c) 2017 John Wiley & Sons, Ltd.
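The two-parameter idea in the abstract (one persistence computation per kernel bandwidth) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the Gaussian kernel, the grid evaluation and the bandwidth values are all assumptions made for illustration, and the bootstrap-based significance step is omitted. The sketch tracks peaks through super-level sets of the density estimate, which is equivalent to a sub-level set filtration of the negated estimate; how the paper orients its filtration is likewise an assumption here.

```python
import numpy as np


def gaussian_kde_on_grid(data, grid, bandwidth):
    """Gaussian kernel density estimate evaluated at each grid point."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    z = (grid[:, None] - data[None, :]) / bandwidth
    norm = len(data) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z ** 2).sum(axis=1) / norm


def mode_persistence(values):
    """0-dimensional persistence of the super-level sets of a sampled 1-D function.

    Returns one (birth, death) pair per peak, sorted by decreasing
    persistence (birth - death). The highest peak never merges into
    another one, so its death is set to the minimum of the function.
    """
    values = np.asarray(values, dtype=float)
    n = len(values)
    order = np.argsort(-values, kind="stable")  # sweep from high to low
    parent = [-1] * n                           # -1 means "not added yet"
    birth = values.copy()                       # peak height, read at component roots

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in map(int, order):
        parent[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component with the lower peak dies here.
                old, young = (ri, rj) if birth[ri] >= birth[rj] else (rj, ri)
                if birth[young] > values[i]:
                    pairs.append((float(birth[young]), float(values[i])))
                parent[young] = old
    root = find(int(order[0]))
    pairs.append((float(birth[root]), float(values.min())))
    return sorted(pairs, key=lambda p: p[0] - p[1], reverse=True)


# Hypothetical usage: a simulated two-component sample scanned over bandwidths.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(-2.0, 0.5, 200), rng.normal(2.0, 0.5, 200)])
grid = np.linspace(-5.0, 5.0, 400)
for h in (0.1, 0.3, 1.0):
    diagram = mode_persistence(gaussian_kde_on_grid(sample, grid, h))
    print(f"bandwidth={h}: {len(diagram)} peaks, top persistences "
          f"{[round(b - d, 3) for b, d in diagram[:3]]}")
```

Each bandwidth yields its own list of (birth, death) pairs; peaks whose persistence stays large across a range of bandwidths are the candidates a significance test would then examine.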
Pages: 462-471
Page count: 10