Traditional outlier detection methods are inadequate for high-dimensional data analysis because distances between objects tend to concentrate (the "curse of dimensionality"). Inspired by Coulomb's law, we propose a new similarity measure vector for high-dimensional data, which consists of the outlier Coulomb force and the outlier Coulomb resultant force. The outlier Coulomb force not only gauges the similarity among data objects, but also reflects the differences among their dimensions through its vector projection onto each dimension. More importantly, the Coulomb resultant force measures the deviation of a data object from a data center, making detection results interpretable. We introduce a new neighborhood outlier factor, on which we build a high-dimensional outlier detection algorithm. In our approach, attribute values with a high degree of deviation are treated as the interpretable information of an outlier. Finally, we implement and evaluate our algorithm on UCI and synthetic datasets. Our experimental results show that the algorithm effectively alleviates the interference of the curse of dimensionality. The findings confirm that the high-dimensional outliers identified by the algorithm are interpretable.
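The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes unit charges, an inverse-square force between a point and each of its k nearest neighbours, and a hypothetical "consistency" ratio (resultant magnitude divided by the sum of the individual force magnitudes) standing in for the neighborhood outlier factor. The names `coulomb_resultant`, `charge`, and `consistency` are illustrative assumptions.

```python
import math

def coulomb_resultant(points, i, k=4, charge=1.0):
    """Return (resultant force vector, consistency ratio) for points[i].

    Assumptions (not from the paper): unit charges, inverse-square
    repulsion, and only the k nearest neighbours exert a force.
    """
    p = points[i]
    # k nearest neighbours of p, as (distance, point) pairs
    neighbours = sorted(
        (math.dist(p, q), q) for j, q in enumerate(points) if j != i
    )[:k]

    resultant = [0.0] * len(p)
    total_magnitude = 0.0
    for d, q in neighbours:
        if d == 0.0:
            continue  # coincident points: force undefined, skipped here
        magnitude = charge * charge / (d * d)
        total_magnitude += magnitude
        # Project the force onto each dimension (unit vector from q to p)
        for dim in range(len(p)):
            resultant[dim] += magnitude * (p[dim] - q[dim]) / d

    norm = math.hypot(*resultant)
    # Near 0 when forces cancel (point is surrounded),
    # near 1 when they all push the same way (point lies outside the data)
    consistency = norm / total_magnitude if total_magnitude > 0 else 0.0
    return resultant, consistency

# Toy data: a small symmetric cluster around the origin plus one distant point
points = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0), (10.0, 10.0)]
center_force, center_score = coulomb_resultant(points, 0)
outlier_force, outlier_score = coulomb_resultant(points, 5)
```

In this toy setup the forces on the central point cancel, while those on the distant point all push away from the cluster. The per-dimension components of `outlier_force` indicate along which attributes the point deviates, which is the kind of interpretable information the abstract refers to.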
Angiulli, F; Basta, S; Pizzuti, C
Institution: Italian Natl Res Council, Inst High Performance Comp & Networking, I-87036 Arcavacata Di Rende, CS, Italy