Local Intrinsic Dimensionality II: Multivariate Analysis and Distributional

Cited by: 28
Authors
Houle, Michael E. [1]
Affiliations
[1] Natl Inst Informat, Chiyoda Ku, 2-1-2 Hitotsubashi, Tokyo 1018430, Japan
Source
SIMILARITY SEARCH AND APPLICATIONS, SISAP 2017 | 2017 / Vol. 10609
Keywords
SIMILARITY SEARCH;
DOI
10.1007/978-3-319-68474-1_6
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Distance-based expansion models of intrinsic dimensionality have had recent application in the analysis of complexity of similarity applications, and in the design of efficient heuristics. This theory paper extends one such model, the local intrinsic dimension (LID), to a multivariate form that can account for the contributions of different distributional components towards the intrinsic dimensionality of the entire feature set, or equivalently towards the discriminability of distance measures defined in terms of these feature combinations. Formulas are established for the effect on LID under summation, product, composition, and convolution operations on smooth functions in general, and cumulative distribution functions in particular. For some of these operations, the dimensional or discriminability characteristics of the result are also shown to depend on a form of distributional support. As an example, an analysis is provided that quantifies the impact of introduced random Gaussian noise on the intrinsic dimension of data. Finally, a theoretical relationship is established between the LID model and the classical correlation dimension.
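The abstract names the operations but this record does not reproduce the formulas. As a rough illustration only, the sketch below assumes the usual definition of the intrinsic dimensionality of a smooth function, ID_F(r) = r * F'(r) / F(r) (an assumption, not quoted from this record), and numerically checks the identities that ordinary calculus gives for products, sums, and compositions. The helper `id_at` and the test functions `F` and `G` are hypothetical names chosen for the example; convolution is not covered.

```python
def id_at(f, r, h=1e-6):
    """Estimate ID_f(r) = r * f'(r) / f(r) using a central difference.

    The definition is an assumption for this sketch; h is the step size.
    """
    deriv = (f(r + h) - f(r - h)) / (2.0 * h)
    return r * deriv / f(r)


# Hypothetical smooth test functions (pure power laws).
def F(x):
    return x ** 3          # ID_F(r) = 3 for every r > 0


def G(x):
    return 2.0 * x ** 2    # ID_G(r) = 2 for every r > 0


r = 0.5
id_f = id_at(F, r)
id_g = id_at(G, r)

# Product: the log-derivative of F*G is the sum of the log-derivatives,
# so ID_{FG}(r) = ID_F(r) + ID_G(r).
id_product = id_at(lambda x: F(x) * G(x), r)

# Sum: ID_{F+G}(r) is the average of ID_F(r) and ID_G(r),
# weighted by the function values F(r) and G(r).
id_sum = id_at(lambda x: F(x) + G(x), r)
id_sum_expected = (F(r) * id_f + G(r) * id_g) / (F(r) + G(r))

# Composition: by the chain rule, ID_{F o G}(r) = ID_F(G(r)) * ID_G(r).
id_compose = id_at(lambda x: F(G(x)), r)
id_compose_expected = id_at(F, G(r)) * id_g

print(id_f, id_g, id_product, id_sum, id_compose)
```

Pure power laws keep the check easy to read, since their ID value is the exponent at every radius; for general smooth functions the same identities hold pointwise, and the local value is obtained as the limit r -> 0+.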
Pages: 80-95
Page count: 16