GRAPH-BASED CHANGE-POINT DETECTION

Cited by: 113
Authors
Chen, Hao [1]
Zhang, Nancy [2]
Affiliations
[1] Univ Calif Davis, Dept Stat, Davis, CA 95616 USA
[2] Univ Penn, Wharton Sch, Dept Stat, Philadelphia, PA 19104 USA
Keywords
Change-point; graph-based tests; nonparametrics; scan statistic; tail probability; high-dimensional data; complex data; network data; non-Euclidean data; LIKELIHOOD RATIO TESTS; CONFIDENCE-REGIONS; MULTIVARIATE; SEQUENCES; NETWORK;
DOI
10.1214/14-AOS1269
CLC classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
We consider the testing and estimation of change-points (locations where the distribution abruptly changes) in a data sequence. A new approach, based on scan statistics utilizing graphs representing the similarity between observations, is proposed. The graph-based approach is nonparametric, and can be applied to any data set as long as an informative similarity measure on the sample space can be defined. Accurate analytic approximations to the significance of graph-based scan statistics for both the single change-point and the changed interval alternatives are provided. Simulations reveal that the new approach has better power than existing approaches when the dimension of the data is moderate to high. The new approach is illustrated on two applications: the determination of authorship of a classic novel, and the detection of change in a network over time.
Pages: 139-176
Page count: 38
Related papers
31 in total
[1] Carlstein E., 1994, Change-Point Problems, V23
[2] CHEN H., 2014, GRAPH BASED CHANGE S, DOI 10.1214/14-AOS1269SUPP
[3]   GRAPH-BASED TESTS FOR TWO-SAMPLE COMPARISONS OF CATEGORICAL DATA [J].
Chen, Hao ;
Zhang, Nancy R. .
STATISTICA SINICA, 2013, 23 (04) :1479-1503
[4] Chen LHY, 2005, LECT NOTES SER INST, V4, P1
[5] COBB GW, 1978, BIOMETRIKA, V65, P243, DOI 10.2307/2335202
[6] COX DR, 1982, SCAND J STAT, V9, P147
[7]   An online Kernel change detection algorithm [J].
Desobry, F ;
Davy, M ;
Doncarli, C .
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2005, 53 (08) :2961-2974
[8]   Inferring friendship network structure by using mobile phone data [J].
Eagle, Nathan ;
Pentland, Alex ;
Lazer, David .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2009, 106 (36) :15274-15278
[9]   MULTIVARIATE GENERALIZATIONS OF THE WALD-WOLFOWITZ AND SMIRNOV 2-SAMPLE TESTS [J].
FRIEDMAN, JH ;
RAFSKY, LC .
ANNALS OF STATISTICS, 1979, 7 (04) :697-717
[10]   Bayesian analysis of a multinomial sequence and homogeneity of literary style [J].
Girón, J ;
Ginebra, J ;
Riba, A .
AMERICAN STATISTICIAN, 2005, 59 (01) :19-30