An adaptive approach for online monitoring of large-scale data streams

Cited by: 0
Authors
Cao, Shuchen [1]
Zhang, Ruizhi [2]
Affiliations
[1] Univ Nebraska Lincoln, Dept Stat, Lincoln, NE USA
[2] Univ Georgia, Dept Stat, Athens, GA USA
Keywords
False discovery rate; CUSUM; quickest change detection; process control; CHANGE-POINT DETECTION; CHANGEPOINT DETECTION; SCHEMES
DOI
10.1080/24725854.2023.2281580
Chinese Library Classification (CLC) number
T [Industrial Technology]
Subject classification code
08
Abstract
In this article, we propose an adaptive top-r method to monitor large-scale data streams in which a change may affect an unknown subset of the data streams at some unknown time. Motivated by parallel and distributed computing, we develop global monitoring schemes by running local detection procedures in parallel and then using the Benjamini-Hochberg false discovery rate control procedure to estimate the number of changed data streams adaptively. Our approach is illustrated in two concrete examples. The first is a homogeneous case in which all data streams are independent and identically distributed with the same known pre-change and post-change distributions. The second is a case in which all data are normally distributed and the mean shifts are unknown and can be positive or negative. Theoretically, we show that when the pre-change and post-change distributions are completely specified, our proposed method can estimate the number of changed data streams for both the pre-change and post-change status. Moreover, we perform simulations and two case studies to show its detection efficiency.
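The scheme outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the homogeneous case with pre-change N(0, 1) and post-change N(mu, 1) streams, it converts each local CUSUM statistic W into a conservative p-value exp(-W) (a standard tail bound for likelihood-ratio CUSUM under the pre-change distribution, which may differ from the paper's choice), and the function names, the `alpha` level, and the `threshold` value are hypothetical.

```python
import numpy as np


def bh_num_rejections(pvals, alpha):
    """Benjamini-Hochberg step-up rule: return the number of rejections."""
    m = len(pvals)
    sorted_p = np.sort(pvals)
    # Largest k such that p_(k) <= alpha * k / m (0 if no such k exists).
    below = sorted_p <= alpha * np.arange(1, m + 1) / m
    return int(np.max(np.nonzero(below)[0]) + 1) if below.any() else 0


def adaptive_top_r_monitor(data, mu, alpha, threshold):
    """Monitor K streams; data has shape (T, K).

    Assumed model (illustrative only): N(0, 1) before the change and
    N(mu, 1) after it. Returns (alarm_time, r_hat), or (None, 0) if no
    alarm is raised within the T observations.
    """
    n_steps, n_streams = data.shape
    w = np.zeros(n_streams)  # local CUSUM statistics, one per stream
    for t in range(n_steps):
        # Log-likelihood ratio of N(mu, 1) against N(0, 1).
        llr = mu * data[t] - 0.5 * mu**2
        w = np.maximum(0.0, w + llr)
        # Assumption: exp(-W) used as a conservative pre-change p-value.
        pvals = np.exp(-w)
        r_hat = bh_num_rejections(pvals, alpha)
        if r_hat > 0:
            # Global statistic: sum of the r_hat largest local statistics.
            global_stat = np.sort(w)[-r_hat:].sum()
            if global_stat >= threshold:
                return t + 1, r_hat
    return None, 0
```

In practice the threshold would be calibrated, for example by simulation, to meet a target false-alarm constraint; the sketch leaves that step out.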
Pages: 119-130
Page count: 12