Anomaly Detection in Catalog Streams

Cited by: 2
Authors
Yang, Chen [1]
Du, Zhihui [2]
Meng, Xiaofeng [3]
Zhang, Xukang [3]
Hao, Xinli [3]
Bader, David A. [2]
Affiliations
[1] Tsinghua Univ, China Natl Clearing Ctr, Dept Comp Sci & Technol, Beijing 100190, Peoples R China
[2] New Jersey Inst Technol, Dept Data Sci, Newark, NJ 07102 USA
[3] Renmin Univ China, Informat Sch, Beijing 100872, Peoples R China
Keywords
Anomaly detection; Real-time systems; Monitoring; Transient analysis; Optimization; Filtering; Feature extraction; Streaming data analysis; distributed stream processing; big scientific data
DOI
10.1109/TBDATA.2022.3161925
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Detecting anomalies with high accuracy and in real time from large volumes of streaming data is a challenge for many real-world applications, such as smart cities, astronomical observation, and remote sensing. This article focuses on a special kind of stream, the catalog stream, whose high-level catalog structure can be used to analyze the stream effectively. We first formulate anomaly detection in catalog streams as a constrained optimization problem based on a catalog stream matrix. We then propose a novel filtering-identifying based anomaly detection algorithm (FIAD), which combines two complementary strategies, true event identifying and false alarm filtering, and pairs a data-oriented general method with a domain-oriented specific method, to detect truly valuable anomalies. Furthermore, different kinds of attention windows are developed to supply the corresponding data to the various algorithm components. A scalable and lightweight catalog stream processing framework, CSPF, is designed to support and implement the proposed method efficiently, and a prototype system is developed to evaluate the algorithm. Extensive experiments are conducted on catalog stream data sets from an operational super-large field-of-view, high-cadence astronomy observation. The experimental results show that the proposed method achieves a false-positive rate as low as 0.04%, reduces false alarms by 98.6% compared with existing methods, and handles each catalog with a latency of 2.1 seconds (well under the required 15 seconds). In addition, a total of 36 transient candidates, including seven microlensing events, 27 superflares, and two dual-superflares, are detected among 21.67 million stars (involving 1.09 million catalogs) from one observation season.
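To make the filter-then-identify idea in the abstract concrete, here is a minimal sketch. It is not the authors' FIAD algorithm, whose details are not given in this record. It assumes each catalog is a list of brightness values with one entry per star, uses a fixed-length sliding window as a stand-in for an "attention window", identifies candidates with a median/MAD threshold, and filters out any catalog in which most stars are flagged at once (taken here as a sign of a frame-wide disturbance rather than a real event). The function name `filter_identify_sketch` and the parameters `window`, `k`, `min_scale`, and `max_flag_fraction` are all hypothetical.

```python
from collections import deque
from statistics import median


def filter_identify_sketch(catalogs, window=5, k=5.0, min_scale=0.01,
                           max_flag_fraction=0.5):
    """Return {catalog_index: [star indices]} for catalogs with surviving alarms.

    Illustrative only: thresholds and the filtering rule are assumptions,
    not values or rules taken from the paper.
    """
    history = None  # one sliding window of past values per star
    alarms = {}
    for t, catalog in enumerate(catalogs):
        if history is None:
            history = [deque(maxlen=window) for _ in catalog]

        # Identifying step: flag stars that deviate strongly from their window.
        candidates = []
        for star, value in enumerate(catalog):
            past = history[star]
            if len(past) == window:
                med = median(past)
                mad = median(abs(v - med) for v in past)
                scale = max(mad, min_scale)  # avoid a zero threshold
                if abs(value - med) > k * scale:
                    candidates.append(star)
            past.append(value)

        # Filtering step: if most stars are flagged in the same catalog,
        # treat the whole catalog as a false alarm (assumed domain rule).
        if candidates and len(candidates) / len(catalog) > max_flag_fraction:
            candidates = []

        if candidates:
            alarms[t] = candidates
    return alarms


# Usage: three stars, eight catalogs, small deterministic jitter.
jitter = [0.00, 0.01, -0.01, 0.02, -0.02, 0.01, 0.00, 0.00]
base = [10.0, 12.0, 14.0]
demo = [[b + j for b in base] for j in jitter]
demo[6][1] += 2.0                      # a single star brightens
demo[7] = [v + 2.0 for v in demo[7]]   # every star shifts at once
print(filter_identify_sketch(demo))    # {6: [1]}
```

In this toy run the single-star jump in catalog 6 survives, while the frame-wide shift in catalog 7 is suppressed by the filtering step. A real system would need per-star noise models and domain rules far beyond this.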
Pages: 294-311
Page count: 18