Robust archetypoids for anomaly detection in big functional data

Cited by: 16
Authors
Vinue, Guillermo [1]
Epifanio, Irene [2]
Affiliations
[1] Katholieke Univ Leuven, Leuven, Belgium
[2] Univ Jaume 1, Castellon De La Plana, Spain
Keywords
Anomaly detection; Functional data analysis; Archetypal analysis; Big data; R package; OUTLIER DETECTION; R PACKAGE; MULTIVARIATE; LOCATION; SERIES;
DOI
10.1007/s11634-020-00412-9
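Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract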
Archetypoid analysis (ADA) has proven to be a successful unsupervised statistical technique for identifying extreme observations on the periphery of the data cloud, in both classical multivariate data and functional data. However, two questions remain open in this field: the use of ADA for outlier detection and its scalability. We propose using robust functional archetypoids and the adjusted boxplot to pinpoint functional outliers. Furthermore, we present a new archetypoid algorithm for obtaining results from large data sets in reasonable time. Functional time series occur in many practical problems, so this paper focuses on functional data settings. The new algorithm for detecting functional anomalies, called CRO-FADALARA, can be used with both univariate and multivariate curves. Our proposal for outlier detection is compared with state-of-the-art methods in a controlled study, where it shows good performance. Furthermore, CRO-FADALARA is applied to two large time series data sets, where the outlier curves are discussed and the reduction in computational time is reported. A third case study with a small ECG data set is also discussed, given its importance in functional data scenarios. All data, R code and a new R package are freely available.
Pages: 437-462
Number of pages: 26