On-line monitoring data quality of high-dimensional data streams
Cited by: 10
Authors:
Qi, Dequan [2]
Li, Zhonghua
Wang, Zhaojun [1]
Affiliations:
[1] Nankai Univ, Inst Stat, Tianjin 300071, Peoples R China
[2] Jilin Med Univ, Dept Math, Jilin, Jilin, Peoples R China
Funding:
National Natural Science Foundation of China;
Keywords:
Data quality;
false discovery rate;
MEWMA;
statistical process control;
FALSE DISCOVERY RATE;
CONTROL CHARTS;
SCHEMES;
IMPACT;
TIME;
DOI:
10.1080/00949655.2015.1106542
Chinese Library Classification:
TP39 [Computer applications];
Discipline classification codes:
081203;
0835;
Abstract:
In recent years, effective monitoring of data quality has increasingly attracted the attention of researchers in statistical process control. None of the existing work on this topic uses multivariate methods to control the multidimensional data quality process; it relies instead on multiple univariate control charts. Based on a novel one-sided multivariate exponentially weighted moving average (MEWMA) chart, we propose a conditional false discovery rate-adjusted scheme for on-line monitoring of the data quality of high-dimensional data streams. With thousands of input data streams, the average run length loses its usefulness because one will likely have out-of-control signals at each time period. Hence, we first control the percentage of signals that are false alarms. Then, we compare the power of the proposed MEWMA scheme with that of two alternative methods. Numerical results show that the proposed MEWMA scheme has higher average power than both competitors.
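The abstract does not give the form of the one-sided MEWMA chart or of the conditional false discovery rate adjustment, so the following is only a rough sketch of the general idea using stand-in techniques: the standard two-sided MEWMA statistic and the Benjamini-Hochberg procedure. It assumes each stream delivers p standardized, mutually independent, normally distributed quality scores with p even, so that the in-control T^2 statistic is chi-square with a closed-form tail. All function names and the defaults lam=0.2 and alpha=0.05 are hypothetical and are not taken from the paper.

```python
import math


def mewma_t2(observations, lam=0.2):
    """Standard (two-sided) MEWMA T^2 path for one stream.

    Assumption: each observation is a list of p standardized, mutually
    independent scores, so the in-control covariance is the identity.
    This is NOT the one-sided chart proposed in the paper.
    """
    p = len(observations[0])
    z = [0.0] * p
    stats = []
    for t, x in enumerate(observations, start=1):
        # EWMA recursion applied to each dimension.
        z = [lam * xi + (1.0 - lam) * zi for xi, zi in zip(x, z)]
        # Exact variance factor of the EWMA vector at time t.
        c = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        stats.append(sum(zi * zi for zi in z) / c)
    return stats


def chi2_sf_even(x, df):
    """Chi-square upper-tail probability; closed form valid for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices of streams flagged at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value passes its threshold.
        if p_values[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


def flag_streams(streams, lam=0.2, alpha=0.05):
    """Flag streams at the latest time point using T^2 p-values and BH."""
    p_values = []
    for obs in streams:
        t2 = mewma_t2(obs, lam)[-1]
        p_values.append(chi2_sf_even(t2, len(obs[0])))
    return benjamini_hochberg(p_values, alpha)
```

Calling flag_streams on a list of streams, each a list of equal-length score vectors, returns the indices of the streams signalled at the most recent time point. Controlling the false discovery rate across streams, rather than calibrating each chart to an average run length, follows the motivation stated in the abstract; the specific procedure here is a substitute, not the authors' scheme.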
Pages: 2204-2216
Page count: 13