Anomaly Detection in Catalog Streams

Cited by: 2
Authors
Yang, Chen [1]
Du, Zhihui [2]
Meng, Xiaofeng [3]
Zhang, Xukang [3]
Hao, Xinli [3]
Bader, David A. [2]
Affiliations
[1] Tsinghua Univ, China Natl Clearing Ctr, Dept Comp Sci & Technol, Beijing 100190, Peoples R China
[2] New Jersey Inst Technol, Dept Data Sci, Newark, NJ 07102 USA
[3] Renmin Univ China, Informat Sch, Beijing 100872, Peoples R China
Keywords
Anomaly detection; Real-time systems; Monitoring; Transient analysis; Optimization; Filtering; Feature extraction; Streaming data analysis; distributed stream processing; big scientific data
DOI
10.1109/TBDATA.2022.3161925
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Detecting anomalies accurately and in real time from large volumes of streaming data is a challenge for many real-world applications, such as smart cities, astronomical observation, and remote sensing. This article focuses on a special kind of stream, the catalog stream, whose high-level catalog structure can be exploited to analyze the stream effectively. We first formulate anomaly detection in catalog streams as a constrained optimization problem based on a catalog stream matrix. We then propose a novel filtering-identifying based anomaly detection algorithm (FIAD), which combines two complementary strategies, true event identifying and false alarm filtering, and pairs a data-oriented general method with a domain-oriented specific method, to detect truly valuable anomalies. Furthermore, different kinds of attention windows are developed to supply the corresponding data to the various algorithm components. A scalable and lightweight catalog stream processing framework, CSPF, is designed to support and implement the proposed method efficiently, and a prototype system is developed to evaluate the algorithm. Extensive experiments are conducted on catalog stream data sets from an operational super-large field-of-view, high-cadence astronomical observation. The experimental results show that the proposed method achieves a false-positive rate as low as 0.04%, reduces false alarms by 98.6% compared with existing methods, and handles each catalog with a latency of 2.1 seconds (much less than the required 15 seconds). Furthermore, a total of 36 transient candidates, including seven microlensing events, 27 superflares, and two dual-superflares, are detected from 21.67 million stars (involving 1.09 million catalogs) in one observation season.
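The abstract names two complementary steps, identifying candidate events and filtering false alarms, fed by per-component attention windows. The sketch below is only an illustration of that general idea and is not the authors' FIAD algorithm or the CSPF framework: it assumes a catalog is a mapping from star ID to one brightness value, uses a fixed-length sliding window per star as a stand-in for an attention window, flags a value by its median-absolute-deviation score, and discards a whole catalog's flags when too many stars deviate at once (taken here as a sign of a frame-wide effect rather than a real transient). The class name, the constants, and the filtering rule are all hypothetical.

```python
from collections import defaultdict, deque
from statistics import median

WINDOW = 20         # hypothetical window length, in catalogs per star
K_SIGMA = 5.0       # hypothetical threshold, in robust standard deviations
MAX_FRACTION = 0.3  # hypothetical: above this flagged fraction, drop the frame


def robust_deviation(history, value):
    """Distance of value from the window median, in units of 1.4826 * MAD."""
    med = median(history)
    mad = median(abs(x - med) for x in history)
    scale = max(1.4826 * mad, 1e-9)  # guard against a perfectly flat history
    return abs(value - med) / scale


class CatalogStreamDetector:
    """Illustrative identify-then-filter detector over a stream of catalogs."""

    def __init__(self):
        # One bounded sliding window of recent normal values per star.
        self.windows = defaultdict(lambda: deque(maxlen=WINDOW))

    def process(self, catalog):
        """catalog: dict of star_id -> brightness. Returns reported star IDs."""
        candidates = set()
        for star, value in catalog.items():
            window = self.windows[star]
            # Identifying step: flag a large deviation once the window is full.
            if len(window) == WINDOW and robust_deviation(window, value) > K_SIGMA:
                candidates.add(star)
            else:
                window.append(value)  # only unflagged values update the baseline
        # Filtering step: many simultaneous flags suggest a frame-wide artifact.
        if len(candidates) > MAX_FRACTION * len(catalog):
            return set()
        return candidates
```

In this sketch, a single star that brightens sharply against a stable history is reported, while a catalog in which every star jumps together is suppressed. Keeping flagged values out of the window prevents one outlier from shifting the baseline it is judged against.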
Pages: 294-311
Number of pages: 18