A Novel Density-Based Clustering Approach for Outlier Detection in High-Dimensional Data

Cited by: 2
Authors
Messaoud, Thouraya Aouled [1]
Smiti, Abir [2]
Louati, Aymen [1]
Affiliations
[1] Univ Jendouba, Inst Super Informat Kef, Jendouba, Tunisia
[2] Inst Super Gest Tunis, LARODEC, Tunis, Tunisia
Source
HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2019 | 2019 / Vol. 11734
Keywords
Outliers; Feature selection; Clustering; DBSCAN
DOI
10.1007/978-3-030-29859-3_28
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Outlier detection, also known as outlier mining, is a central task in data mining and machine learning applications. It matters in medical data because outliers may carry valuable information; however, outlier detection can perform very poorly on high-dimensional data. In this paper, we propose a new outlier detection technique, named Infinite Feature Selection DBSCAN, that uses a feature selection strategy to mitigate the curse of dimensionality. The main purpose of the proposed method is to reduce the dimensionality of a high-dimensional data set so that outliers can be identified efficiently using clustering techniques. Simulations on real databases show the effectiveness of the method in terms of accuracy, error rate, F-score, and retrieval time.
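The abstract describes a two-stage pipeline: reduce the feature set first, then flag as outliers the points that a density-based clustering leaves unassigned. The sketch below illustrates that general shape only and is not the authors' implementation. In particular, the ranking step is a plain variance ranking swapped in as a stand-in for Infinite Feature Selection, and the function names (`select_features`, `dbscan_noise`, `make_demo_data`), the synthetic data, and the parameter values `k`, `eps`, and `min_pts` are all assumptions made for illustration. It uses only the Python standard library (3.8+).

```python
import math
import random
from statistics import pvariance


def select_features(rows, k):
    """Keep the k highest-variance columns.

    Assumption: variance ranking is only a simple stand-in here;
    it is NOT the Infinite Feature Selection ranking from the paper.
    """
    n_cols = len(rows[0])
    variances = [pvariance([r[j] for r in rows]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[r[j] for j in keep] for r in rows]


def dbscan_noise(points, eps, min_pts):
    """Return the indices of points that DBSCAN would label as noise.

    A point is a core point if at least min_pts points (itself included)
    lie within eps of it. Noise points are neither core points nor
    within eps of any core point.
    """
    n = len(points)
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = {i for i in range(n) if len(neighbors[i]) >= min_pts}
    return [
        i for i in range(n)
        if i not in core and not any(j in core for j in neighbors[i])
    ]


def make_demo_data(seed=0):
    """Hypothetical data: 2 informative columns, 8 near-constant columns."""
    rng = random.Random(seed)
    rows = []
    for center in (0.0, 10.0):  # two dense clusters of 20 points each
        for _ in range(20):
            informative = [center + rng.uniform(-0.5, 0.5) for _ in range(2)]
            irrelevant = [rng.uniform(-0.05, 0.05) for _ in range(8)]
            rows.append(informative + irrelevant)
    rows.append([5.0, -5.0] + [0.0] * 8)  # one planted outlier at index 40
    return rows


rows = make_demo_data()
kept, reduced = select_features(rows, k=2)
outliers = dbscan_noise(reduced, eps=1.5, min_pts=5)
print(kept, outliers)  # prints: [0, 1] [40]
```

On this toy data the two informative columns are retained and only the planted point is reported as noise. The brute-force neighbor search is quadratic in the number of points, which is acceptable for a demonstration but would normally be replaced by a spatial index on real data.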
Pages: 322-331
Page count: 10