A multivariate Bayesian scan statistic for early event detection and characterization

Cited by: 46
Authors
Neill, Daniel B. [1]
Cooper, Gregory F. [2]
Affiliations
[1] Carnegie Mellon Univ, Sch Publ Policy & Management, HJ Heinz III Coll, Pittsburgh, PA 15213 USA
[2] Univ Pittsburgh, Dept Biomed Informat, Pittsburgh, PA 15260 USA
Funding
National Science Foundation (USA)
Keywords
Event detection; Event characterization; Biosurveillance; Scan statistics;
DOI
10.1007/s10994-009-5144-4
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present the multivariate Bayesian scan statistic (MBSS), a general framework for event detection and characterization in multivariate spatial time series data. MBSS integrates prior information and observations from multiple data streams in a principled Bayesian framework, computing the posterior probability of each type of event in each space-time region. MBSS learns a multivariate Gamma-Poisson model from historical data, and models the effects of each event type on each stream using expert knowledge or labeled training examples. We evaluate MBSS on various disease surveillance tasks, detecting and characterizing outbreaks injected into three streams of Pennsylvania medication sales data. We demonstrate that MBSS can be used both as a "general" event detector, with high detection power across a variety of event types, and a "specific" detector that incorporates prior knowledge of an event's effects to achieve much higher detection power. MBSS has many other advantages over previous event detection approaches, including faster computation and easy interpretation and visualization of results, and allows faster and more accurate event detection by integrating information from the multiple streams. Most importantly, MBSS can model and differentiate between multiple event types, thus distinguishing between events requiring urgent responses and other, less relevant patterns in the data.
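The posterior computation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each stream's count in a region is Poisson with mean q times a known baseline, that q has a Gamma(alpha, rate = beta) prior, and that an event type scales the prior shape alpha of each stream by a fixed multiplier. The function names, the multipliers, the prior probabilities, and all numbers below are hypothetical and chosen only for illustration.

```python
from math import exp, lgamma, log


def log_marginal(count, baseline, alpha, beta):
    """Log marginal likelihood of count ~ Poisson(q * baseline),
    with q ~ Gamma(shape=alpha, rate=beta) integrated out
    (this is a negative binomial distribution)."""
    return (lgamma(alpha + count) - lgamma(alpha) - lgamma(count + 1)
            + alpha * log(beta / (beta + baseline))
            + count * log(baseline / (beta + baseline)))


def posterior(counts, baselines, alpha, beta, effects, priors):
    """Posterior probability of each hypothesis for one space-time region.

    effects maps a hypothesis name to one multiplier per stream
    (assumption: the multiplier scales the Gamma prior shape);
    priors maps the same names to prior probabilities.
    """
    log_post = {}
    for name, multipliers in effects.items():
        log_lik = sum(log_marginal(c, b, m * alpha, beta)
                      for c, b, m in zip(counts, baselines, multipliers))
        log_post[name] = log(priors[name]) + log_lik
    # Normalize in log space to avoid underflow.
    top = max(log_post.values())
    total = sum(exp(v - top) for v in log_post.values())
    return {name: exp(v - top) / total for name, v in log_post.items()}


# Hypothetical example: three streams, the first one elevated.
effects = {
    "null": [1.0, 1.0, 1.0],
    "event_A": [1.5, 1.0, 1.0],  # mainly affects stream 1
    "event_B": [1.0, 1.5, 1.0],  # mainly affects stream 2
}
priors = {"null": 0.98, "event_A": 0.01, "event_B": 0.01}
result = posterior([150, 100, 100], [100.0, 100.0, 100.0],
                   alpha=50.0, beta=50.0, effects=effects, priors=priors)
for name, prob in result.items():
    print(f"{name}: {prob:.4f}")
```

Because each event type carries its own per-stream multipliers, the same observed counts yield a separate posterior for every event type, which is the sense in which a Bayesian scan of this kind can characterize an event as well as detect it.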
Pages: 261-282
Number of pages: 22