Spatial Rank-Based Augmentation for Nonparametric Online Monitoring and Adaptive Sampling of Big Data Streams

Cited by: 6
Authors
Zan, Xin [1]
Wang, Di [2]
Xian, Xiaochen [1]
Affiliations
[1] Univ Florida, Dept Ind & Syst Engn, Gainesville, FL 32611 USA
[2] Shanghai Jiao Tong Univ, Sch Mech Engn, Dept Ind Engn & Management, Shanghai, Peoples R China
Funding
U.S. National Science Foundation; Natural Science Foundation of Shanghai; National Natural Science Foundation of China
Keywords
Data augmentation; Distribution-free; Internet of Things (IoT); Partial observations; Statistical process control (SPC); CONTROL CHARTS; MEAN VECTOR; THINGS IOT; INTERNET;
DOI
10.1080/00401706.2022.2143903
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
The age of Internet of Things (IoT) has witnessed the rapid development of modern data acquisition devices and communicating-actuating networks, which enables the generation of big data streams shared across platforms for remote and efficient decision making of many critical systems. The monitoring of big data streams remains a challenging task in various practical applications mainly due to their complexity in interrelationships, large volume, and high velocity, which places prohibitive demands on monitoring methodologies and resources. To tackle the challenges of monitoring unexchangeable and correlated big data streams with only partial observations available under resource constraints, we propose a method by incorporating spatial rank-based statistics with effective data augmentation techniques for the online unobservable data streams that can analytically inform the monitoring and sampling decisions based only on partially observed data streams. By exploiting historical data, the proposed method preserves strong descriptive power of general big data streams under partial observations and can explicitly use the correlation among data streams, and thus allows effective monitoring and equitable sampling over general heterogeneous and correlated big data streams, which is free of simplified assumptions (e.g., exchangeability) compared to existing methods. Theoretical investigations are carried out to evaluate the effectiveness of the augmentation statistics as well as the sampling strategy, which guarantee the superiority of the sampling performance over existing methods. Simulations under various scenarios and two real case studies are also conducted to evaluate and validate the performance of the proposed method.
Pages: 243-256
Page count: 14