Correlation-based detection of attribute outliers

Cited by: 0
Authors
Koh, Judice L. Y. [1, 2]
Lee, Mong Li [2]
Hsu, Wynne [2]
Lam, Kai Tak [3]
Affiliations
[1] Inst Infocomm Res, Singapore 119613, Singapore
[2] Natl Univ Singapore, Sch Comput, Singapore 117548, Singapore
[3] Inst High Performance Comput, Singapore 117528, Singapore
Source
ADVANCES IN DATABASES: CONCEPTS, SYSTEMS AND APPLICATIONS | 2007 / Vol. 4443
Keywords
outlier detection; data cleaning
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
An outlier is an object that does not conform to the normal behavior of the data set. In data cleaning, outliers are identified to reduce noise in the data. In applications such as fraud detection and stock market analysis, outliers suggest abnormal behavior that requires further investigation. Existing outlier detection methods have focused on class outliers, and research on attribute outliers is limited, even though attribute outliers play an equal role in degrading data quality and reducing data mining accuracy. In this paper, we propose a novel method to detect attribute outliers from the deviating correlation behavior of attributes. We formulate three metrics to evaluate the outlier-ness of attributes and introduce an adaptive factor to distinguish outliers from non-outliers. Experiments with both synthetic and real-world data sets indicate that the proposed method is effective in detecting attribute outliers.
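The abstract does not define the three metrics or the adaptive factor, so the following is only a minimal sketch of the general idea of scoring an attribute value by how well it agrees with the other attribute values in the same record. The scoring rule (mean conditional co-occurrence frequency), the fixed `threshold`, the function names, and the toy `records` are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from itertools import permutations


def attribute_outlier_scores(records):
    """Score each (row, attribute) cell by how well its value agrees with the
    other attribute values in the same row. Illustrative metric only."""
    n_attrs = len(records[0])
    single = Counter()  # counts of (attribute index, value)
    pair = Counter()    # counts of ((i, value_i), (j, value_j)) for i != j
    for row in records:
        for i, v in enumerate(row):
            single[(i, v)] += 1
        for i, j in permutations(range(n_attrs), 2):
            pair[((i, row[i]), (j, row[j]))] += 1

    scores = {}
    for r, row in enumerate(records):
        for i, v in enumerate(row):
            # Mean of P(value_i | value_j) over the other attributes j.
            conds = [
                pair[((i, v), (j, row[j]))] / single[(j, row[j])]
                for j in range(n_attrs)
                if j != i
            ]
            scores[(r, i)] = sum(conds) / len(conds)
    return scores


def flag_outliers(scores, threshold=0.3):
    """Return cells whose score falls below a fixed, hypothetical cutoff."""
    return sorted(cell for cell, s in scores.items() if s < threshold)


# Toy data (hypothetical): the last record pairs "France"/"FR" with "Asia".
records = [
    ("Singapore", "SG", "Asia"),
    ("Singapore", "SG", "Asia"),
    ("Singapore", "SG", "Asia"),
    ("Singapore", "SG", "Asia"),
    ("France", "FR", "Europe"),
    ("France", "FR", "Europe"),
    ("France", "FR", "Europe"),
    ("France", "FR", "Asia"),
]

scores = attribute_outlier_scores(records)
print(flag_outliers(scores))  # -> [(7, 2)]
```

Only the cell holding "Asia" in the last record is flagged here, not the whole record, which is the distinction between an attribute outlier and a class outlier. A fixed cutoff is the simplest possible choice; the paper's adaptive factor presumably replaces it with a data-dependent rule.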
Pages: 164 / +
Page count: 2
Related papers
50 in total
  • [1] Correlation-based attribute outlier detection in XML
    Koh, Judice L. Y.
    Lee, Mong Li
    Hsu, Wynne
    Ang, Wee Tiong
    2008 IEEE 24TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2008, : 1522 - +
  • [2] An integrated intrusion detection system using correlation-based attribute selection and artificial neural network
    Thaseen, I. Sumaiya
    Banu, J. Saira
    Lavanya, K.
    Ghalib, Muhammad Rukunuddin
    Abhishek, Kumar
    TRANSACTIONS ON EMERGING TELECOMMUNICATIONS TECHNOLOGIES, 2021, 32 (02)
  • [3] A correlation-based approach to attribute selection in chemical graph mining
    Okada, Takashi
    NEW FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2007, 3609 : 517 - 526
  • [4] Scale correlation-based edge detection
    Bao, P
    Lei, Z
    PROCEEDINGS VIPROMCOM-2002, 2002, : 345 - 350
  • [5] Regression based on Neural Incremental Attribute Learning with Correlation-based Feature Ordering
    Wang, Ting
    Zhu, Xiaoyan
    Guan, Sheng-Uei
    Man, Ka Lok
    Ting, T. O.
    PROCEEDINGS OF THE 2015 7TH IEEE INTERNATIONAL CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEMS (CIS) AND ROBOTICS, AUTOMATION AND MECHATRONICS (RAM), 2015, : 109 - 113
  • [6] Improved Attribute Reduction Algorithm Based on Rough Set and Correlation-based Analysis
    Luo, Hairui
    Yan, Jianzhuo
    Fang, Liying
    Wang, Hui
    Shi, Xinqing
    INTERNATIONAL CONFERENCE ON COMPUTATIONAL AND INFORMATION SCIENCES (ICCIS 2014), 2014, : 1009 - 1015
  • [7] Categorical data clustering: A correlation-based approach for unsupervised attribute weighting
    Carbonera, Joel Luis
    Abel, Mara
    2014 IEEE 26TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2014, : 259 - 263
  • [8] An improved correlation-based algorithm with discretization for attribute reduction in data clustering
    Kannan, S. Senthamarai
    Ramaraj, N.
    Data Science Journal, 2009, 8 : 125 - 138
  • [9] A correlation-based approach for event detection in Instagram
    dos Santos, Elder Donizetti
    Quiles, Marcos Goncalves
    Faria, Fabio Augusto
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2018, 34 (05) : 2971 - 2982
  • [10] On the performance of correlation-based packet detection techniques
    Recayte, Estefania
    Munari, Andrea
    PHYSICAL COMMUNICATION, 2023, 59