Local nonlinear dimensionality reduction via preserving the geometric structure of data

Cited by: 8
Authors
Wang, Xiang [1, 2]
Zhu, Junxing [1 ]
Xu, Zichen [3 ]
Ren, Kaijun [1, 2]
Liu, Xinwang [2 ]
Wang, Fengyun [4 ]
Affiliations
[1] Natl Univ Def Technol, Coll Meteorol & Oceanog, Changsha 410073, Hunan, Peoples R China
[2] Natl Univ Def Technol, Coll Comp Sci & Technol, Changsha 410073, Hunan, Peoples R China
[3] Nanchang Univ, Coll Comp Sci & Technol, Nanchang 330000, Jiangxi, Peoples R China
[4] Univ Leeds, Sch Comp, Leeds LS2 9JT, West Yorkshire, England
Keywords
Dimensionality reduction; Embedding learning; Geometric preservation; Random walk;
DOI
10.1016/j.patcog.2023.109663
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dimensionality reduction has many applications in data visualization and machine learning. Existing methods can be classified as global or local. Global methods usually learn the linear relationships in data, while local methods learn the intrinsic geometric structure of the manifold, which has a significant impact on pattern recognition. However, most existing local methods obtain an embedding through eigenvalue or singular value decomposition, whose computational complexity is very high for large amounts of high-dimensional data. In this paper, we propose a local nonlinear dimensionality reduction method named Vec2vec, which employs a neural network with only one hidden layer to reduce the computational complexity. We first build a neighborhood similarity graph from the input matrix, and then define the context of data points using the random walk properties of the graph. Finally, we train the neural network with the context of data points to learn the embedding of the matrix. We conduct extensive data classification and clustering experiments on nine image and text datasets to evaluate the performance of our method. Experimental results show that Vec2vec is better than several state-of-the-art dimensionality reduction methods, except that it is equivalent to UMAP on data clustering tasks in the statistical hypothesis tests; however, Vec2vec needs less computational time than UMAP on high-dimensional data. Furthermore, we propose a more lightweight method named Approximate Vec2vec (AVec2vec), which employs an approximate method to build the neighborhood similarity graph and incurs little performance degradation. AVec2vec is still better than some state-of-the-art local dimensionality reduction methods and competitive with UMAP on data classification and clustering tasks in the statistical hypothesis tests. © 2023 Published by Elsevier Ltd.
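The three steps named in the abstract (neighborhood similarity graph, random-walk contexts, one-hidden-layer network) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, walk strategy, or training objective, so cosine similarity, weighted first-order walks, and skip-gram with negative sampling are all assumptions here, and every function name and parameter below is hypothetical.

```python
import numpy as np

def knn_graph(X, k):
    """Assumed graph step: k most cosine-similar neighbors per row of X."""
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    S = Xn @ Xn.T
    np.fill_diagonal(S, -np.inf)              # exclude self-loops
    nbrs = np.argsort(-S, axis=1)[:, :k]      # indices of the top-k neighbors
    w = np.clip(np.take_along_axis(S, nbrs, axis=1), 1e-12, None)
    return nbrs, w / w.sum(axis=1, keepdims=True)  # rows are transition probs

def random_walks(nbrs, probs, walks_per_node, walk_len, rng):
    """Assumed context step: similarity-weighted walks from every node."""
    walks = []
    for _ in range(walks_per_node):
        for start in rng.permutation(nbrs.shape[0]):
            walk = [start]
            for _ in range(walk_len - 1):
                cur = walk[-1]
                walk.append(rng.choice(nbrs[cur], p=probs[cur]))
            walks.append(walk)
    return np.array(walks)

def train_skipgram(walks, n, dim, window, negatives, lr, epochs, rng):
    """Assumed training step: skip-gram with uniform negative sampling.
    The single hidden layer is W_in; its rows are the learned embedding."""
    W_in = (rng.random((n, dim)) - 0.5) / dim
    W_out = np.zeros((n, dim))
    labels = np.zeros(negatives + 1)
    labels[0] = 1.0                            # first target is the true context
    for _ in range(epochs):
        for walk in walks:
            for i, center in enumerate(walk):
                lo, hi = max(0, i - window), min(len(walk), i + window + 1)
                for j in range(lo, hi):
                    if j == i:
                        continue
                    targets = np.concatenate(
                        ([walk[j]], rng.integers(0, n, negatives)))
                    v, u = W_in[center].copy(), W_out[targets]
                    g = 1.0 / (1.0 + np.exp(-(u @ v))) - labels  # logit grads
                    W_in[center] -= lr * (g @ u)
                    # .at handles repeated indices among the sampled targets
                    np.subtract.at(W_out, targets, lr * np.outer(g, v))
    return W_in

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 20))              # toy input matrix
    nbrs, probs = knn_graph(X, k=5)
    walks = random_walks(nbrs, probs, walks_per_node=2, walk_len=8, rng=rng)
    Z = train_skipgram(walks, n=50, dim=4, window=2, negatives=3,
                       lr=0.025, epochs=1, rng=rng)
    print(Z.shape)
```

The dense similarity matrix in `knn_graph` costs quadratic time and memory, so it only suits small inputs; an approximate neighbor search, as the abstract mentions for AVec2vec, would replace that step on larger data.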
Pages: 14
Related papers
50 in total
  • [41] Semi-supervised dimensionality reduction via sparse locality preserving projection
    Huijie Guo
    Hui Zou
    Junyan Tan
    Applied Intelligence, 2020, 50 : 1222 - 1232
  • [42] A nonlinear method for dimensionality reduction of data using reference nodes
    E. V. Myasnikov
    Pattern Recognition and Image Analysis, 2012, 22 (2) : 337 - 345
  • [43] Nonlinear Dimensionality Reduction for Low Data Regimes in Photonics Design
    Grinberg, Yuri
    Al-Digeil, Muhammad
    Melati, Daniele
    Dezfouli, Mohsen Kamandar
    Schmid, Jens H.
    Cheben, Pavel
    Janz, Siegfried
    Xu, Danxia
    2022 PHOTONICS NORTH (PN), 2022,
  • [44] SPATIALLY AWARE SUPERVISED NONLINEAR DIMENSIONALITY REDUCTION FOR HYPERSPECTRAL DATA
    Volpi, Michele
    Tuia, Devis
    2014 6TH WORKSHOP ON HYPERSPECTRAL IMAGE AND SIGNAL PROCESSING: EVOLUTION IN REMOTE SENSING (WHISPERS), 2014,
  • [45] A Framework for Local Supervised Dimensionality Reduction of High Dimensional Data
    Aggarwal, Charu C.
    PROCEEDINGS OF THE SIXTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2006, : 360 - 371
  • [46] Geometry and statistics-preserving manifold embedding for nonlinear dimensionality reduction
    Islam, Md Tauhidul
    Xing, Lei
    PATTERN RECOGNITION LETTERS, 2021, 151 : 155 - 162
  • [47] Local and Global Preserving Semisupervised Dimensionality Reduction Based on Random Subspace for Cancer Classification
    Cai, Xianfa
    Wei, Jia
    Wen, Guihua
    Yu, Zhiwen
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2014, 18 (02) : 500 - 507
  • [48] RETRACTED: Nonlinear Spatial Data Dimensionality Reduction strategy based on Local Linear hypothesis for WSN (Retracted Article)
    Song, Xin
    Liu, Kui
    Wang, Cuirong
    2011 INTERNATIONAL CONFERENCE ON ENERGY AND ENVIRONMENTAL SCIENCE-ICEES 2011, 2011, 11 : 3523 - 3529
  • [49] Nonlinear dimensionality reduction for clustering
    Tasoulis, Sotiris
    Pavlidis, Nicos G.
    Roos, Teemu
    PATTERN RECOGNITION, 2020, 107 (107)
  • [50] Nonlinear Dimensionality Reduction on Graphs
    Shen, Yanning
    Traganitis, Panagiotis A.
    Giannakis, Georgios B.
    2017 IEEE 7TH INTERNATIONAL WORKSHOP ON COMPUTATIONAL ADVANCES IN MULTI-SENSOR ADAPTIVE PROCESSING (CAMSAP), 2017,