A new approach to discover interlacing data structures in high-dimensional space

Cited by: 0
Authors
Tao Ban
Changshui Zhang
Shigeo Abe
Affiliations
[1] National Institute of Information and Communications Technology, Information Security Research Center
[2] Tsinghua University, Department of Automation
[3] Kobe University, Graduate School of Science and Technology
Source
Journal of Intelligent Information Systems | 2009 / Vol. 33
Keywords
Interlacing data structures; High-dimensional space; Real world dataset
DOI
Not available
Abstract
Discovering the structures hidden in a high-dimensional data space is of great significance for understanding and further processing the data. Real-world datasets are often composed of multiple low-dimensional patterns, and the interlacing of these patterns may impede our ability to understand the distribution rule of the data. Few existing methods focus on detecting and extracting the manifolds that represent distinct patterns. Inspired by the nonlinear dimensionality reduction method Isomap, this paper presents a novel approach called Multi-Manifold Partition to identify the interlacing low-dimensional patterns. The algorithm has three steps. First, a neighborhood graph is built to capture the intrinsic topological structure of the input data. Second, the dimensional uniformity of neighboring nodes is analyzed to discover the segments of patterns. Finally, the segments that possibly come from the same low-dimensional structure are combined to obtain a global representation of the distribution rules. Experiments on synthetic data as well as real problems are reported. The results show that this new approach to exploratory data analysis is effective and may enhance our understanding of the data distribution.
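The first two steps described in the abstract can be illustrated with a small sketch. This is not the authors' Multi-Manifold Partition implementation, whose details the abstract does not give. It assumes a brute-force k-nearest-neighbor graph for step one, and for step two it swaps in a simple pivoted Gram-Schmidt rank estimate of each point's neighbor offsets as the local dimension, then groups neighboring nodes that share the same estimate. The names `knn_graph`, `local_dimension`, `segments`, and the parameters `k` and `tol` are hypothetical, chosen for illustration only. The third step, merging segments that belong to the same structure, is omitted.

```python
import math
from collections import deque


def knn_graph(points, k):
    """Step 1 (assumed form): indices of the k nearest neighbors of each point."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(others[:k])
    return neighbors


def local_dimension(points, i, nbrs, tol=0.05):
    """Step 2a (assumed form): numerical rank of the offsets from point i.

    Pivoted Gram-Schmidt: take the offset with the largest residual, project
    it out of every offset, and stop once the energy that is left falls to
    tol times the original energy or less.
    """
    offsets = [[a - b for a, b in zip(points[j], points[i])] for j in nbrs]
    total = sum(x * x for v in offsets for x in v)
    dim = 0
    while True:
        remaining = sum(x * x for v in offsets for x in v)
        if remaining <= tol * total:
            return dim
        pivot = max(offsets, key=lambda v: sum(x * x for x in v))
        norm = math.sqrt(sum(x * x for x in pivot))
        unit = [x / norm for x in pivot]
        projected = []
        for v in offsets:
            dot = sum(a * b for a, b in zip(v, unit))
            projected.append([a - dot * b for a, b in zip(v, unit)])
        offsets = projected
        dim += 1


def segments(neighbors, dims):
    """Step 2b (assumed form): connected components of the neighbor graph,
    keeping only edges whose two endpoints have the same local dimension."""
    n = len(neighbors)
    adjacent = [set() for _ in range(n)]
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            if dims[i] == dims[j]:
                adjacent[i].add(j)
                adjacent[j].add(i)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacent[u]:
                if labels[v] == -1:
                    labels[v] = current
                    queue.append(v)
        current += 1
    return labels


# Toy data: a 5 x 5 planar patch (2-D pattern) and a straight line (1-D pattern)
# embedded in 3-D space.
plane = [(float(x), float(y), 0.0) for x in range(5) for y in range(5)]
line = [(float(t), float(t), 10.0 + t) for t in range(8)]
points = plane + line

neighbors = knn_graph(points, k=4)
dims = [local_dimension(points, i, neighbors[i]) for i in range(len(points))]
labels = segments(neighbors, dims)
```

On this toy input the planar patch and the line end up in separate segments because their local dimension estimates differ. Real data with noise or intersecting patterns would need a more robust dimension estimate and the merging step that this sketch leaves out.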