Anomaly Detection in Catalog Streams

Cited by: 2
Authors
Yang, Chen [1]
Du, Zhihui [2]
Meng, Xiaofeng [3]
Zhang, Xukang [3]
Hao, Xinli [3]
Bader, David A. [2]
Affiliations
[1] Tsinghua Univ, China Natl Clearing Ctr, Dept Comp Sci & Technol, Beijing 100190, Peoples R China
[2] New Jersey Inst Technol, Dept Data Sci, Newark, NJ 07102 USA
[3] Renmin Univ China, Informat Sch, Beijing 100872, Peoples R China
Keywords
Anomaly detection; Real-time systems; Monitoring; Transient analysis; Optimization; Filtering; Feature extraction; Streaming data analysis; anomaly detection; distributed stream processing; big scientific data
DOI
10.1109/TBDATA.2022.3161925
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Detecting anomalies accurately and in real time from large volumes of streaming data is a challenge for many real-world applications, such as smart cities, astronomical observation, and remote sensing. This article focuses on a special kind of stream, the catalog stream, whose high-level catalog structure can be used to analyze the stream effectively. We first formulate anomaly detection in catalog streams as a constrained optimization problem based on a catalog stream matrix. We then propose a novel filtering-identifying based anomaly detection algorithm (FIAD), which combines two complementary strategies, true event identifying and false alarm filtering, and pairs a data-oriented general method with a domain-oriented specific method to detect truly valuable anomalies. Furthermore, different kinds of attention windows are developed to supply the corresponding data to the various algorithm components. A scalable and lightweight catalog stream processing framework, CSPF, is designed to support and implement the proposed method efficiently. A prototype system is developed to evaluate the proposed algorithm. Extensive experiments are conducted on catalog stream data sets from an operational super-large field-of-view, high-cadence astronomy observation. The experimental results show that the proposed method achieves a false-positive rate as low as 0.04%, reduces false alarms by 98.6% compared with existing methods, and handles each catalog with a latency of 2.1 seconds (much less than the required 15 seconds). Furthermore, a total of 36 transient candidates, including seven microlensing events, 27 superflares, and two dual-superflares, are detected from 21.67 million stars (involving 1.09 million catalogs) in one observation season.
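The identify-then-filter idea in the abstract can be illustrated with a small sketch. This is not the authors' FIAD algorithm or the CSPF framework, whose details the abstract does not give. It assumes a catalog is a list of brightness values (one per star), uses a plain z-score over a fixed-length sliding window as a stand-in for an "attention window", and treats a catalog in which most stars are flagged at once as a frame-wide false alarm. All names (AttentionWindow, detect, demo_catalogs) and all thresholds are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class AttentionWindow:
    """Fixed-length sliding window of recent values for one star (hypothetical)."""

    def __init__(self, size):
        self.values = deque(maxlen=size)

    def score(self, x):
        """Z-score of x against the window; 0.0 if history is short or flat."""
        if len(self.values) < self.values.maxlen:
            return 0.0
        sigma = pstdev(self.values)
        return 0.0 if sigma == 0 else (x - mean(self.values)) / sigma

    def push(self, x):
        self.values.append(x)


def detect(catalogs, window_size=5, z_threshold=3.0, max_flagged_fraction=0.5):
    """Return (time index, star index) alerts from a sequence of catalogs.

    catalogs[t][i] is the measured value of star i in the t-th catalog.
    """
    windows = None
    alerts = []
    for t, catalog in enumerate(catalogs):
        if windows is None:
            windows = [AttentionWindow(window_size) for _ in catalog]
        # Identifying step: flag stars that deviate strongly from their own history.
        flagged = [i for i, x in enumerate(catalog)
                   if abs(windows[i].score(x)) > z_threshold]
        # Filtering step (assumed heuristic): if most stars jump in the same
        # catalog, treat it as a frame-wide artifact and drop the alerts.
        if flagged and len(flagged) / len(catalog) <= max_flagged_fraction:
            alerts.extend((t, i) for i in flagged)
        # Update every window with the new measurement, anomalous or not.
        for i, x in enumerate(catalog):
            windows[i].push(x)
    return alerts


def demo_catalogs():
    """Synthetic data: 4 stars, 8 catalogs, one single-star event, one frame-wide jump."""
    catalogs = [[10.0 + 0.1 * ((t + i) % 2) for i in range(4)] for t in range(8)]
    catalogs[5][2] = 12.0                      # isolated brightening of star 2
    catalogs[7] = [x + 2.0 for x in catalogs[7]]  # whole-frame offset (artifact)
    return catalogs


if __name__ == "__main__":
    print(detect(demo_catalogs()))
```

On the synthetic data, the isolated brightening of star 2 in catalog 5 is reported, while the offset that hits the whole of catalog 7 is suppressed by the filtering step. A real system would need domain-specific filters and would likely keep anomalous values out of the reference window.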
Pages: 294-311
Number of pages: 18
Related papers
50 in total
  • [1] Anomaly detection in the Open Supernova Catalog
    Pruzhinskaya, M. V.
    Malanchev, K. L.
    Kornilov, M. V.
    Ishida, E. E. O.
    Mondon, F.
    Volnova, A. A.
    Korolev, V. S.
    MONTHLY NOTICES OF THE ROYAL ASTRONOMICAL SOCIETY, 2019, 489 (03) : 3591 - 3608
  • [2] Explainable Anomaly Detection in Industrial Streams
    Jakubowski, Jakub
    Stanisz, Przemyslaw
    Bobek, Szymon
    Nalepa, Grzegorz J.
    ARTIFICIAL INTELLIGENCE-ECAI 2023 INTERNATIONAL WORKSHOPS, PT 1, XAI3, TACTIFUL, XI-ML, SEDAMI, RAAIT, AI4S, HYDRA, AI4AI, 2023, 2024, 1947 : 87 - 100
  • [3] Anomaly detection for smartphone data streams
    Mirsky, Yisroel
    Shabtai, Asaf
    Shapira, Bracha
    Elovici, Yuval
    Rokach, Lior
    PERVASIVE AND MOBILE COMPUTING, 2017, 35 : 83 - 107
  • [4] Conditional anomaly detection in event streams
    Huber, Marco F.
    AT-AUTOMATISIERUNGSTECHNIK, 2017, 65 (04) : 233 - 244
  • [5] Anomaly Pattern Detection on Data Streams
    Park, Cheong Hee
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2018, : 689 - 692
  • [6] Anomaly Detection in Streams with Extreme Value Theory
    Siffer, Alban
    Fouque, Pierre-Alain
    Termier, Alexandre
    Largouet, Christine
    KDD'17: PROCEEDINGS OF THE 23RD ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2017, : 1067 - 1075
  • [7] Outlier and anomaly pattern detection on data streams
    Park, Cheong Hee
    JOURNAL OF SUPERCOMPUTING, 2019, 75 : 6118 - 6128
  • [8] Anomaly Detection on Data Streams for Smart Agriculture
    Moso, Juliet Chebet
    Cormier, Stephane
    de Runz, Cyril
    Fouchal, Hacene
    Wandeto, John Mwangi
    AGRICULTURE-BASEL, 2021, 11 (11):
  • [9] OHODIN - Online Anomaly Detection for Data Streams
    Gruhl, Christian
    Tomforde, Sven
    2021 IEEE INTERNATIONAL CONFERENCE ON AUTONOMIC COMPUTING AND SELF-ORGANIZING SYSTEMS COMPANION (ACSOS-C 2021), 2021, : 193 - 197
  • [10] Review of Anomaly Detection Algorithms for Data Streams
    Lu, Tianyuan
    Wang, Lei
    Zhao, Xiaoyong
    APPLIED SCIENCES-BASEL, 2023, 13 (10):