Outlier mining based on Variance of Angle technology research in High-Dimensional Data

Cited by: 1
Authors
Liu, Wenting [1]
Pan, Ruikai [2]
Affiliations
[1] Hohai Univ, Coll Comp & Informat, Nanjing, Jiangsu, Peoples R China
[2] Xinhua News Agcy, Xinhua Daily Press Grp, Nanjing, Jiangsu, Peoples R China
Source
2015 10TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND KNOWLEDGE ENGINEERING (ISKE) | 2015
Keywords
outlier; high dimensional data; variance
DOI
10.1109/ISKE.2015.64
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Outlier mining in high-dimensional data is currently one of the active areas of data mining. The existing outlier mining methods are based on distance in the full-dimensional Euclidean space. In high-dimensional data, these methods are bound to deteriorate because of the notorious curse of dimensionality: distance measures lose their original physical meaning, and computational efficiency is low. This paper improves the angle-based outlier factor method and proposes a variance-of-angle-based outlier factor method. It introduces the related theory to support the reliability of the method. Empirical experiments on synthetic data sets show that the method is efficient and scales to large high-dimensional data sets.
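The abstract does not spell out the proposed algorithm, so the following is only a minimal sketch of the general idea behind angle-based scoring, not the authors' method. It assumes a simplified, unweighted score: for each point, take the variance of the cosines of the angles it forms with every pair of other points. A point far outside a cluster sees all other points in nearly the same direction, so its variance is small. The function name `angle_variance_scores`, the synthetic data, and the planted point `far_point` are all hypothetical choices made for illustration.

```python
import math
import random
from itertools import combinations
from statistics import pvariance


def angle_variance_scores(points):
    """For each point, return the variance of the cosines of the angles
    it forms with all pairs of other points (small value = outlier-like)."""
    scores = []
    for i, p in enumerate(points):
        # Unit vectors from p to every other point.
        # Assumption: no other point coincides exactly with p.
        units = []
        for j, q in enumerate(points):
            if j == i:
                continue
            diff = [b - a for a, b in zip(p, q)]
            norm = math.sqrt(sum(d * d for d in diff))
            units.append([d / norm for d in diff])
        # Cosine of the angle at p for every unordered pair of other points.
        cosines = [
            sum(a * b for a, b in zip(u, v))
            for u, v in combinations(units, 2)
        ]
        scores.append(pvariance(cosines))
    return scores


# Synthetic data: a Gaussian cluster in 20 dimensions plus one planted point.
random.seed(0)
dim = 20
cluster = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(40)]
far_point = [8.0] * dim  # hypothetical planted outlier
data = cluster + [far_point]

scores = angle_variance_scores(data)
suspect = min(range(len(scores)), key=scores.__getitem__)
print("index with the smallest angle variance:", suspect)
```

This brute-force version enumerates every pair for every point, so its cost grows cubically with the number of points; it is meant to show the scoring idea, not the efficiency the abstract reports.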
Pages: 598-603
Number of pages: 6