Unsupervised outlier detection in multidimensional data

Cited by: 0
Authors
Atiq ur Rehman
Samir Brahim Belhaouari
Affiliation
[1] Hamad Bin Khalifa University, ICT Division, College of Science and Engineering
Keywords
Anomaly/outliers detection; Advanced statistical methods; Computationally inexpensive methods; High dimensional data
DOI
Not available
Abstract
Detecting and removing outliers is a fundamental preprocessing task; without it, analysis of the data can be misleading. Anomalies in the data can also heavily degrade the performance of machine learning algorithms. This paper proposes novel statistical techniques for detecting anomalies in a dataset in an unsupervised manner. The techniques are based on statistical methods that consider data compactness and other properties. The proposed ideas are found to be efficient in terms of performance, ease of implementation, and computational complexity. Two of the proposed techniques transform the data to a unidimensional distance space to detect outliers, so they remain computationally inexpensive and feasible regardless of how high-dimensional the data are. The paper presents a comprehensive performance analysis of the proposed anomaly detection schemes, which are found to outperform state-of-the-art methods on several benchmark datasets.
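The abstract does not spell out how the transformation to a unidimensional distance space works, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a coordinate-wise median as the reference point, Euclidean distance, and Tukey's upper fence (Q3 + k * IQR) as the cutoff; the function name `distance_space_outliers` and the parameter `k` are hypothetical. The point it illustrates is that once every sample is reduced to a single distance value, the outlier test itself is univariate, whatever the original dimension.

```python
import math
import random
import statistics


def distance_space_outliers(points, k=1.5):
    """Flag outliers after mapping each point to one distance value.

    Assumptions (not taken from the paper): median reference point,
    Euclidean distance, and Tukey's upper fence as the threshold.
    """
    dims = range(len(points[0]))
    # Robust reference point: the median of each coordinate.
    center = [statistics.median(p[j] for p in points) for j in dims]
    # Each d-dimensional point becomes a single scalar distance.
    distances = [math.dist(p, center) for p in points]
    # Univariate rule applied in the distance space.
    q1, _, q3 = statistics.quantiles(distances, n=4)
    threshold = q3 + k * (q3 - q1)
    flags = [d > threshold for d in distances]
    return flags, distances, threshold


# Synthetic illustration: 200 Gaussian points in 50 dimensions
# plus 3 planted points far from the bulk of the data.
random.seed(0)
inliers = [[random.gauss(0.0, 1.0) for _ in range(50)] for _ in range(200)]
planted = [[8.0] * 50 for _ in range(3)]
flags, distances, threshold = distance_space_outliers(inliers + planted)
```

In this sketch the thresholding step works on one number per sample, so its cost does not grow with the dimension; only the distance computation does, and that growth is linear.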
Related papers
50 in total
  • [1] Unsupervised outlier detection in multidimensional data
    Ur Rehman, Atiq
    Belhaouari, Samir Brahim
    JOURNAL OF BIG DATA, 2021, 8 (01)
  • [2] An outlier ensemble for unsupervised anomaly detection in honeypots data
    Boukela, Lynda
    Zhang, Gongxuan
    Bouzefrane, Samia
    Zhou, Junlong
    INTELLIGENT DATA ANALYSIS, 2020, 24 (04) : 743 - 758
  • [3] Benchmarking Unsupervised Outlier Detection with Realistic Synthetic Data
    Steinbuss, Georg
    Boehm, Klemens
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2021, 15 (04)
  • [4] Unsupervised Outlier Detection Mechanism for Tea Traceability Data
    Yang, Honggang
    Li, Shaowen
    Tu, Lijing
    Ma, Rongrong
    Chen, Yin
    IEEE ACCESS, 2022, 10 : 94818 - 94831
  • [5] Unsupervised approach for online outlier detection in industrial process data
    Bechny, Michal
    Himmelbauer, Johannes
    3RD INTERNATIONAL CONFERENCE ON INDUSTRY 4.0 AND SMART MANUFACTURING, 2022, 200 : 257 - 266
  • [6] Unsupervised Outlier Detection in Streaming Data Using Weighted Clustering
    Thakran, Yogita
    Toshniwal, Durga
    2012 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS (ISDA), 2012 : 947 - 952
  • [7] A survey on unsupervised subspace outlier detection methods for high dimensional data
    Ahn, Jaehyeong
    Kwon, Sunghoon
    KOREAN JOURNAL OF APPLIED STATISTICS, 2021, 34 (03) : 507 - 521
  • [8] RDPOD: an unsupervised approach for outlier detection
    Abhaya Abhaya
    Bidyut Kr. Patra
    NEURAL COMPUTING AND APPLICATIONS, 2022, 34 : 1065 - 1077
  • [9] A new unsupervised outlier detection method
    Zheng, Lina
    Chen, Lijun
    Wang, Yini
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2024, 46 (01) : 1713 - 1734
  • [10] Internal Evaluation of Unsupervised Outlier Detection
    Marques, Henrique O.
    Campello, Ricardo J. G. B.
    Sander, Jorg
    Zimek, Arthur
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2020, 14 (04)