Change point detection in high dimensional data with U-statistics

Cited by: 0
Authors
Boniece, B. Cooper [1]
Horvath, Lajos [2]
Jacobs, Peter M. [3]
Affiliations
[1] Drexel Univ, Dept Math, Philadelphia, PA 19104 USA
[2] Univ Utah, Dept Math, Salt Lake City, UT 84112 USA
[3] Univ Utah, Sch Comp, Salt Lake City, UT 84112 USA
Funding
UK Research and Innovation;
Keywords
Large dimensional vectors; U-statistics; Weak convergence; Change point; Twitter data; MULTIVARIATE; DISTANCE; METRICS;
DOI
10.1007/s11749-023-00900-y
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract
We consider the problem of detecting distributional changes in a sequence of high dimensional data. Our approach combines two separate statistics stemming from L_p norms whose behavior is similar under H_0 but potentially different under H_A, leading to a testing procedure that is flexible against a variety of alternatives. We establish the asymptotic distribution of our proposed test statistics separately in cases of weakly dependent and strongly dependent coordinates as min{N, d} → ∞, where N denotes sample size and d is the dimension, and establish consistency of testing and estimation procedures in high dimensions under one-change alternative settings. Computational studies in single and multiple change point scenarios demonstrate that our method can outperform other nonparametric approaches in the literature for certain alternatives in high dimensions. We illustrate our approach through an application to Twitter data concerning the mentions of U.S. governors.
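For orientation only, the general idea of scanning for a single change point with a distance-based U-statistic can be sketched as follows. This is not the statistic proposed in the paper: it is a generic energy-distance-type scan built from pairwise L_p distances, and the function names (`pairwise_lp`, `scan_change_point`), the weighting k(N-k)/N, and the `min_seg` parameter are illustrative assumptions.

```python
import numpy as np


def pairwise_lp(X, p=2.0):
    """Return the N x N matrix of L_p distances between the rows of X (N x d)."""
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return (diff ** p).sum(axis=2) ** (1.0 / p)


def scan_change_point(X, p=2.0, min_seg=5):
    """Scan candidate split points k and return (k_hat, max_stat).

    Illustrative assumption: for each k, compare the mean between-segment
    distance with the mean within-segment distances (an energy-distance-type
    contrast), weighted by k * (N - k) / N. Requires min_seg >= 2.
    """
    N = X.shape[0]
    D = pairwise_lp(X, p)
    best_k, best_stat = None, -np.inf
    for k in range(min_seg, N - min_seg + 1):
        # The diagonal of D is zero, so the full block sum equals the sum
        # over ordered pairs i != j; divide by the number of such pairs.
        within_left = D[:k, :k].sum() / (k * (k - 1))
        within_right = D[k:, k:].sum() / ((N - k) * (N - k - 1))
        between = D[:k, k:].mean()
        stat = (k * (N - k) / N) * (2.0 * between - within_left - within_right)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


if __name__ == "__main__":
    # Synthetic example: N = 60 observations in d = 50 dimensions,
    # with a mean shift in every coordinate after observation 30.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 50))
    X[30:] += 1.0
    k_hat, stat = scan_change_point(X)
    print("estimated change point:", k_hat, "statistic:", stat)
```

The scan returns only the location of the largest contrast. Turning it into a test would require critical values, for example from the limit theory described in the abstract or from a permutation scheme, neither of which is sketched here.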
Pages: 400-452
Page count: 53