Proposing a Dimensionality Reduction Technique With an Inequality for Unsupervised Learning from High-Dimensional Big Data

Citations: 0
Authors
Ismkhan, Hassan [1]
Izadi, Mohammad [1]
Affiliations
[1] Sharif Univ Technol, Fac Comp Engn, Tehran 1458889694, Iran
Source
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS | 2023 / Vol. 53 / No. 06
Keywords
Clustering algorithms; Task analysis; Feature extraction; Unsupervised learning; Transforms; Standards; Big data; dimensionality reduction (DR); high-dimensional data; k-means; nearest neighbor (NN)
DOI
10.1109/TSMC.2023.3234227
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Clustering can be considered one of the most important unsupervised learning tasks. For almost all clustering algorithms, finding the nearest neighbors of a point within a certain radius r (NN-r) is a critical task. For a high-dimensional dataset, this task becomes too time-consuming. This article proposes a simple dimensionality reduction (DR) technique. For a point p in d-dimensional space, it produces a point p' in d'-dimensional space, where d' << d. In addition, for any pair of points p and q and their maps p' and q' in the target space, it is proved that |p, q| ≥ |p', q'| holds, where |·, ·| denotes the Euclidean distance between a pair of points. This property can speed up finding NN-r: for a given radius r and a pair of points p and q, whenever |p', q'| > r, q cannot be in the NN-r of p. Using this property, the task of finding NN-r is accelerated. Then, as a case study, the technique is applied to accelerate k-means, one of the most famous unsupervised learning algorithms, where it can automatically determine d'. The proposed NN-r method and the accelerated k-means are compared with recent state-of-the-art methods, and both yield favorable results.
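The pruning rule in the abstract can be illustrated with a minimal sketch. The abstract does not describe the paper's actual transform, so the code below swaps in a simple block-norm map as a stand-in: it is an assumption chosen only because it provably never increases Euclidean distances (by the reverse triangle inequality applied to each block). The names `block_norm_map`, `radius_neighbors`, and `d_reduced` are hypothetical and do not come from the paper.

```python
import math


def block_norm_map(p, d_reduced):
    """Map a d-dimensional point to d_reduced dimensions.

    Hypothetical stand-in transform (not necessarily the paper's):
    split the coordinates into d_reduced contiguous blocks and keep
    the Euclidean norm of each block. For any two points, the distance
    between their images is at most the distance between the originals.
    """
    d = len(p)
    bounds = [i * d // d_reduced for i in range(d_reduced + 1)]
    return [math.sqrt(sum(x * x for x in p[lo:hi]))
            for lo, hi in zip(bounds, bounds[1:])]


def radius_neighbors(points, query_index, r, d_reduced):
    """Return (indices within distance r of the query, full distances computed)."""
    reduced = [block_norm_map(p, d_reduced) for p in points]
    p, p_reduced = points[query_index], reduced[query_index]
    neighbors, full_computations = [], 0
    for j, (q, q_reduced) in enumerate(zip(points, reduced)):
        if j == query_index:
            continue
        # Cheap test in the reduced space: if the reduced distance already
        # exceeds r, the true distance exceeds r too, so q is skipped.
        if math.dist(p_reduced, q_reduced) > r:
            continue
        full_computations += 1
        if math.dist(p, q) <= r:
            neighbors.append(j)
    return neighbors, full_computations
```

The result is the same as a brute-force scan; only the number of full d-dimensional distance computations changes. How much is saved depends on the data and on how tight the chosen map's lower bound is, which this sketch makes no claim about.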
Pages: 3880 - 3889
Page count: 10
Related papers
  • [1] ResNet Autoencoders for Unsupervised Feature Learning From High-Dimensional Data: Deep Models Resistant to Performance Degradation
    Wickramasinghe, Chathurika S.
    Marino, Daniel L.
    Manic, Milos
    IEEE ACCESS, 2021, 9 : 40511 - 40520
  • [2] A hybrid dimensionality reduction method for outlier detection in high-dimensional data
    Meng, Guanglei
    Wang, Biao
    Wu, Yanming
    Zhou, Mingzhe
    Meng, Tiankuo
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2023, 14 (11) : 3705 - 3718
  • [3] Hybrid Dimensionality Reduction Forest With Pruning for High-Dimensional Data Classification
    Chen, Weihong
    Xu, Yuhong
    Yu, Zhiwen
    Cao, Wenming
    Chen, C. L. Philip
    Han, Guoqiang
    IEEE ACCESS, 2020, 8 : 40138 - 40150
  • [4] Flexible High-Dimensional Unsupervised Learning with Missing Data
    Wei, Yuhong
    Tang, Yang
    McNicholas, Paul D.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (03) : 610 - 621
  • [5] Dimensionality Reduction for Registration of High-Dimensional Data Sets
    Xu, Min
    Chen, Hao
    Varshney, Pramod K.
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2013, 22 (08) : 3041 - 3049
  • [6] A hybrid dimensionality reduction method for outlier detection in high-dimensional data
    Guanglei Meng
    Biao Wang
    Yanming Wu
    Mingzhe Zhou
    Tiankuo Meng
    International Journal of Machine Learning and Cybernetics, 2023, 14 : 3705 - 3718
  • [7] Comparing and Exploring High-Dimensional Data with Dimensionality Reduction Algorithms and Matrix Visualizations
    Cutura, Rene
    Aupetit, Michael
    Fekete, Jean-Daniel
    Sedlmair, Michael
    PROCEEDINGS OF THE WORKING CONFERENCE ON ADVANCED VISUAL INTERFACES AVI 2020, 2020
  • [8] Self-taught dimensionality reduction on the high-dimensional small-sized data
    Zhu, Xiaofeng
    Huang, Zi
    Yang, Yang
    Shen, Heng Tao
    Xu, Changsheng
    Luo, Jiebo
    PATTERN RECOGNITION, 2013, 46 (01) : 215 - 229
  • [9] Dependence maps, a dimensionality reduction with dependence distance for high-dimensional data
    Lee, Kichun
    Gray, Alexander
    Kim, Heeyoung
    DATA MINING AND KNOWLEDGE DISCOVERY, 2013, 26 (03) : 512 - 532
  • [10] An Optimized Dimensionality Reduction Model for High-dimensional Data Based on Restricted Boltzmann Machines
    Zhang, Ke
    Liu, Jianhuan
    Chai, Yi
    Qian, Kun
    2015 27TH CHINESE CONTROL AND DECISION CONFERENCE (CCDC), 2015 : 2963 - 2968