An adaptive approach for online monitoring of large-scale data streams

Cited by: 0
Authors
Cao, Shuchen [1]
Zhang, Ruizhi [2]
Affiliations
[1] Univ Nebraska Lincoln, Dept Stat, Lincoln, NE USA
[2] Univ Georgia, Dept Stat, Athens, GA USA
Keywords
False discovery rate; CUSUM; quickest change detection; process control; change-point detection; changepoint detection; schemes
DOI
10.1080/24725854.2023.2281580
Chinese Library Classification
T [Industrial Technology]
Subject classification code
08
Abstract
In this article, we propose an adaptive top-r method to monitor large-scale data streams in which a change may affect an unknown subset of the data streams at some unknown time. Motivated by parallel and distributed computing, we develop global monitoring schemes by running local detection procedures in parallel and then using the Benjamini-Hochberg false discovery rate control procedure to adaptively estimate the number of changed data streams. Our approach is illustrated with two concrete examples. The first is a homogeneous case in which all data streams are independent and identically distributed with the same known pre-change and post-change distributions. The second is a case in which all data are normally distributed and the mean shifts are unknown and can be positive or negative. Theoretically, we show that when the pre-change and post-change distributions are completely specified, our proposed method can estimate the number of changed data streams in both the pre-change and post-change states. Moreover, we perform simulations and two case studies to demonstrate its detection efficiency.
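The idea in the abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes N(0,1) pre-change and N(mu,1) post-change data, uses a standard CUSUM recursion as the local procedure, converts local statistics to empirical p-values from simulated pre-change runs, and takes the global statistic to be the sum of the r largest local statistics. The function names (`cusum_update`, `bh_num_rejections`, `adaptive_top_r`) and all parameter values are hypothetical.

```python
import numpy as np

def cusum_update(w, x, mu=1.0):
    """One CUSUM step per stream for N(0,1) -> N(mu,1) (assumed model)."""
    return np.maximum(0.0, w + mu * x - 0.5 * mu**2)

def bh_num_rejections(pvals, alpha=0.1):
    """Benjamini-Hochberg step-up: number of hypotheses rejected at level alpha."""
    p = np.sort(np.asarray(pvals, dtype=float))
    m = p.size
    below = np.nonzero(p <= alpha * np.arange(1, m + 1) / m)[0]
    return 0 if below.size == 0 else int(below[-1]) + 1

def adaptive_top_r(w, null_draws, alpha=0.1):
    """Estimate r with BH, then sum the r largest local statistics.

    The empirical p-value construction and the top-r sum are assumptions
    made for illustration.
    """
    w = np.asarray(w, dtype=float)
    sorted_null = np.sort(np.asarray(null_draws, dtype=float))
    # Number of simulated pre-change statistics that are >= each observed one.
    n_ge = sorted_null.size - np.searchsorted(sorted_null, w, side="left")
    pvals = (1 + n_ge) / (1 + sorted_null.size)
    r = bh_num_rejections(pvals, alpha)
    top = np.sort(w)[::-1][:r]
    return r, float(top.sum())

# Small simulation: 100 streams, the first 10 shift in mean at time 100.
rng = np.random.default_rng(0)
K, n_changed, mu = 100, 10, 1.0

# Reference draws of the local statistic under the pre-change model.
null_w = np.zeros(5000)
for _ in range(200):
    null_w = cusum_update(null_w, rng.standard_normal(null_w.size), mu)

w = np.zeros(K)
for t in range(200):
    x = rng.standard_normal(K)
    if t >= 100:
        x[:n_changed] += mu
    w = cusum_update(w, x, mu)

r_hat, global_stat = adaptive_top_r(w, null_w, alpha=0.1)
print("estimated number of changed streams:", r_hat)
print("global monitoring statistic:", global_stat)
```

In a monitoring loop, `adaptive_top_r` would be evaluated at every time step and an alarm raised once the global statistic crosses a threshold calibrated to a target false-alarm rate; that calibration is not shown here.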
Pages: 119-130
Number of pages: 12
Related papers
50 in total
  • [31] Analysis of extended partial least squares for monitoring large-scale processes
    Chen, Q
    Kruger, U
    IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, 2005, 13 (05) : 807 - 813
  • [32] Fast and covariate-adaptive method amplifies detection power in large-scale multiple hypothesis testing
    Zhang, Martin J.
    Xia, Fei
    Zou, James
    NATURE COMMUNICATIONS, 2019, 10 (1)
  • [33] Reproducible learning in large-scale graphical models
    Zhou, Jia
    Li, Yang
    Zheng, Zemin
    Li, Daoji
    JOURNAL OF MULTIVARIATE ANALYSIS, 2022, 189
  • [34] Correlation and large-scale simultaneous significance testing
    Efron, Bradley
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2007, 102 (477) : 93 - 103
  • [35] A nonparametric mixture approach to density and null proportion estimation in large-scale multiple comparison problems
    Xue, Xiangjie
    Wang, Yong
    AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, 2023, 65 (01) : 49 - 75
  • [36] Online monitoring of high-dimensional binary data streams with application to extreme weather surveillance
    Fang, Zhiwen
    Li, Wendong
    Liu, Xin
    Pu, Xiaolong
    Xiang, Dongdong
    JOURNAL OF APPLIED STATISTICS, 2022, 49 (16) : 4122 - 4136
  • [37] Large-scale multiple testing under dependence
    Sun, Wenguang
    Cai, T. Tony
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2009, 71 : 393 - 424
  • [38] Assessing the Effectiveness of Direct Data Merging Strategy in Long-Term and Large-Scale Pharmacometabonomics
    Cui, Xuejiao
    Yang, Qingxia
    Li, Bo
    Tang, Jing
    Zhang, Xiaoyu
    Li, Shuang
    Li, Fengcheng
    Hu, Jie
    Lou, Yan
    Qiu, Yunqing
    Xue, Weiwei
    Zhu, Feng
    FRONTIERS IN PHARMACOLOGY, 2019, 10
  • [39] ADAPTIVE CHANGE POINT MONITORING FOR HIGH-DIMENSIONAL DATA
    Wu, Teng
    Wang, Runmin
    Yan, Hao
    Shao, Xiaofeng
    STATISTICA SINICA, 2022, 32 (03) : 1583 - 1610
  • [40] Large-scale signal detection: A unified perspective
    Mukhopadhyay, Subhadeep
    BIOMETRICS, 2016, 72 (02) : 325 - 334