Local nonlinear dimensionality reduction via preserving the geometric structure of data

Times Cited: 8
Authors
Wang, Xiang [1 ,2 ]
Zhu, Junxing [1 ]
Xu, Zichen [3 ]
Ren, Kaijun [1 ,2 ]
Liu, Xinwang [2 ]
Wang, Fengyun [4 ]
Affiliations
[1] Natl Univ Def Technol, Coll Meteorol & Oceanog, Changsha 410073, Hunan, Peoples R China
[2] Natl Univ Def Technol, Coll Comp Sci & Technol, Changsha 410073, Hunan, Peoples R China
[3] Nanchang Univ, Coll Comp Sci & Technol, Nanchang 330000, Jiangxi, Peoples R China
[4] Univ Leeds, Sch Comp, Leeds LS2 9JT, West Yorkshire, England
Keywords
Dimensionality reduction; Embedding learning; Geometric preservation; Random walk;
DOI
10.1016/j.patcog.2023.109663
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Dimensionality reduction has many applications in data visualization and machine learning. Existing methods can be divided into global and local ones. Global methods usually learn linear relationships in the data, while local methods learn the intrinsic geometric structure of the underlying manifold, which has a significant impact on pattern recognition. However, most existing local methods obtain an embedding through eigenvalue or singular value decomposition, whose computational cost becomes prohibitive for large amounts of high-dimensional data. In this paper, we propose a local nonlinear dimensionality reduction method named Vec2vec, which employs a neural network with only one hidden layer to reduce the computational complexity. We first build a neighborhood similarity graph from the input matrix, then define the context of each data point through random walks on the graph, and finally train the neural network on these contexts to learn the embedding of the matrix. We conduct extensive classification and clustering experiments on nine image and text datasets to evaluate the performance of our method. Experimental results show that Vec2vec outperforms several state-of-the-art dimensionality reduction methods; the exception is data clustering, where statistical hypothesis tests show it to be equivalent to UMAP, although Vec2vec requires less computation time than UMAP on high-dimensional data. Furthermore, we propose a more lightweight variant named Approximate Vec2vec (AVec2vec), which builds the neighborhood similarity graph with an approximate method at the cost of only a small performance degradation. AVec2vec still outperforms several state-of-the-art local dimensionality reduction methods and is competitive with UMAP on data classification and clustering tasks under statistical hypothesis tests. © 2023 Published by Elsevier Ltd.
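A minimal sketch of the pipeline described in the abstract, under the assumption that a skip-gram-style model (here gensim's Word2Vec, standing in for the one-hidden-layer network) is trained on uniform random walks over a k-nearest-neighbor graph; the function name vec2vec_like_embedding and all hyperparameter values are illustrative and are not the authors' settings.

import numpy as np
from gensim.models import Word2Vec
from sklearn.neighbors import kneighbors_graph

def vec2vec_like_embedding(X, dim=32, k=10, n_walks=10, walk_length=40, seed=0):
    """Return an (n_samples, dim) embedding of the data matrix X."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Step 1: neighborhood similarity graph (here a plain k-NN adjacency).
    graph = kneighbors_graph(X, n_neighbors=k, mode="connectivity", include_self=False)
    neighbors = [graph[i].indices for i in range(n)]
    # Step 2: uniform random walks on the graph define each point's context.
    walks = []
    for _ in range(n_walks):
        for start in range(n):
            walk = [start]
            for _ in range(walk_length - 1):
                nbrs = neighbors[walk[-1]]
                if len(nbrs) == 0:
                    break
                walk.append(int(rng.choice(nbrs)))
            walks.append([str(node) for node in walk])
    # Step 3: skip-gram with one hidden layer; the hidden weights are the embedding.
    model = Word2Vec(walks, vector_size=dim, window=5, min_count=1, sg=1, epochs=5, seed=seed)
    return np.vstack([model.wv[str(i)] for i in range(n)])

For example, calling vec2vec_like_embedding(X, dim=2) on a data matrix X would produce a 2-D embedding suitable for visualization; the biased walk sampling and similarity weighting of the actual method are simplified here to uniform walks on an unweighted graph.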
Pages: 14
Related Papers
50 records in total
  • [31] Dimensionality reduction for data of unknown cluster structure
    Nowakowska, Ewa
    Koronacki, Jacek
    Lipovetsky, Stan
    INFORMATION SCIENCES, 2016, 330 : 74 - 87
  • [32] Category Guided Sparse Preserving Projection for Biometric Data Dimensionality Reduction
    Huang, Qianying
    Wu, Yunsong
    Zhao, Chenqiu
    Zhang, Xiaohong
    Yang, Dan
    Biometric Recognition, 2016, 9967 : 539 - 546
  • [34] Structure preserving non-negative matrix factorization for dimensionality reduction
    Li, Zechao
    Liu, Jing
    Lu, Hanqing
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2013, 117 (09) : 1175 - 1189
  • [35] Global structure-guided neighborhood preserving embedding for dimensionality reduction
    Gao, Can
    Li, Yong
    Zhou, Jie
    Pedrycz, Witold
    Lai, Zhihui
    Wan, Jun
    Lu, Jianglin
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2022, 13 (07) : 2013 - 2032
  • [37] Local dimensionality reduction
    Marchette, DJ
    Poston, WL
    COMPUTATIONAL STATISTICS, 1999, 14 (04) : 469 - 489
  • [38] Nonlinear supervised dimensionality reduction via smooth regular embeddings
    Ornek, Cem
    Vural, Elif
    PATTERN RECOGNITION, 2019, 87 : 55 - 66
  • [39] Dimensionality reduction via adjusting data distribution density
    Wang, Wei
    Shen, Wei-guo
    Sun, Ya-xin
    Chen, Bin
    Zhu, Rong
    2018 5TH INTERNATIONAL CONFERENCE ON SYSTEMS AND INFORMATICS (ICSAI), 2018, : 1052 - 1055
  • [40] Semi-supervised dimensionality reduction via sparse locality preserving projection
    Guo, Huijie
    Zou, Hui
    Tan, Junyan
    APPLIED INTELLIGENCE, 2020, 50 (04) : 1222 - 1232