Clustering-based real-time anomaly detection: A breakthrough in big data technologies

Cited by: 46
Authors
Habeeb, Riyaz Ahamed Ariyaluran [1]
Nasaruddin, Fariza [1]
Gani, Abdullah [6]
Amanullah, Mohamed Ahzam [3]
Hashem, Ibrahim Abaker Targio [2]
Ahmed, Ejaz [4]
Imran, Muhammad [5]
Affiliations
[1] Univ Malaya, Fac Comp Sci & Informat Technol, Dept Informat Syst, Kuala Lumpur 50603, Malaysia
[2] Taylors Univ, Sch Comp & Informat Technol, Subang Jaya, Malaysia
[3] Telekom Res & Dev Sdn Bhd, Res & Innovat Dev, Cyberjaya, Malaysia
[4] Univ Malaya, Ctr Mobile Cloud Comp Res C4MCCR, Kuala Lumpur, Malaysia
[5] King Saud Univ, Coll Appl Comp Sci, Riyadh, Saudi Arabia
[6] Univ Malaya, Dept Comp Syst & Technol, Fac Comp Sci & Informat Technol, Kuala Lumpur, Malaysia
Keywords
DETECTION SYSTEM; FRAMEWORK; INTERNET; MACHINE
DOI
10.1002/ett.3647
Chinese Library Classification number
TN [Electronic technology, communication technology]
Subject classification code
0809
Abstract
Of late, the ever-increasing use of connected Internet-of-Things devices has augmented the volume of high-velocity real-time network data. At the same time, threats to networks have become inevitable; hence, identifying anomalies in real-time network data has become crucial. To date, most existing anomaly detection approaches focus mainly on machine learning techniques for batch processing. Meanwhile, detection approaches that focus on real-time analytics are deficient in detection accuracy while consuming more memory and requiring longer execution time. As such, this paper proposes a novel framework for real-time anomaly detection based on big data technologies. In addition, this paper develops a streaming sliding window local outlier factor coreset clustering algorithm (SSWLOFCC), which was then implemented in the framework. The proposed framework, which comprises BroIDS, Flume, Kafka, Spark Streaming, Spark MLlib, Matplot, and HBase, was evaluated to substantiate its efficacy, particularly in terms of accuracy, memory consumption, and execution time. The evaluation was performed through a critical comparative analysis against existing approaches, such as K-means, hierarchical density-based spatial clustering of applications with noise (HDBSCAN), isolation forest, spectral clustering, and agglomerative clustering. Moreover, the Adjusted Rand Index and the memory profiler package were used to evaluate the proposed framework against the existing approaches. The evaluation demonstrated the efficacy of the proposed framework, with a much higher accuracy rate of 96.51% compared with the other algorithms. The proposed framework also outperformed the existing algorithms in terms of lower memory consumption and execution time. Ultimately, the proposed solution enables analysts to precisely track and detect anomalies in real time.
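The abstract names the Adjusted Rand Index as the accuracy measure for comparing clusterings. As a rough illustration only, the following is a minimal pure-Python sketch of the standard pair-counting definition of that index. It is not the authors' evaluation code; the function name `adjusted_rand_index` and the toy label lists are assumptions made for this example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same items.

    Hypothetical helper for illustration; it follows the standard
    pair-counting (contingency table) definition of the index.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # zero or one item: the labelings trivially agree

    # Contingency table cells and its row and column totals.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Numbers of item pairs placed together in each grouping.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = 0.5 * (sum_rows + sum_cols)       # best achievable agreement
    if max_index == expected:
        return 1.0  # degenerate case, e.g. one cluster in both labelings
    return (sum_cells - expected) / (max_index - expected)


# Toy example (assumed data): 0 = normal traffic, 1 = anomaly.
truth = [0, 0, 0, 1, 1, 0]
predicted = [1, 1, 1, 0, 0, 1]  # same grouping, cluster ids swapped
print(adjusted_rand_index(truth, predicted))
```

Because the index compares groupings rather than label values, the swapped cluster ids in the toy example still count as perfect agreement, which is why it suits unsupervised detectors whose cluster ids are arbitrary.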
Pages: 27