Proposing a Dimensionality Reduction Technique With an Inequality for Unsupervised Learning from High-Dimensional Big Data

Cited by: 0
Authors
Ismkhan, Hassan [1]
Izadi, Mohammad [1]
Affiliations
[1] Sharif Univ Technol, Fac Comp Engn, Tehran 1458889694, Iran
Source
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS | 2023 / Vol. 53 / No. 06
Keywords
Clustering algorithms; Task analysis; Feature extraction; Unsupervised learning; Dimensionality reduction; Transforms; Standards; Big data; dimensionality reduction (DR); high-dimensional data; k-means; nearest neighbor (NN)
DOI
10.1109/TSMC.2023.3234227
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Clustering can be considered one of the most important unsupervised learning tasks. For almost all clustering algorithms, finding the nearest neighbors of a point within a certain radius r (NN-r) is a critical task. For a high-dimensional dataset, this task becomes too time-consuming. This article proposes a simple dimensionality reduction (DR) technique. For a point p in d-dimensional space, it produces a point p' in d'-dimensional space, where d' << d. In addition, for any pair of points p and q and their maps p' and q' in the target space, it is proved that |p, q| > |p', q'| holds, where |, | denotes the Euclidean distance between a pair of points. This property can speed up finding NN-r: for a given radius r and a pair of points p and q, whenever |p', q'| > r, the original distance |p, q| also exceeds r, so q cannot be in the NN-r of p. Using this trick, the task of finding the NN-r is accelerated. Then, as a case study, the technique is applied to accelerate k-means, one of the most famous unsupervised learning algorithms, where it can automatically determine d'. The proposed NN-r method and the accelerated k-means are compared with recent state-of-the-art methods, and both yield favorable results.
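The pruning idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual transform, so the map below is a stand-in assumption: it splits the coordinates into contiguous blocks and keeps one Euclidean norm per block, which by the reverse triangle inequality never increases the distance between two points (it guarantees the non-strict bound |p', q'| <= |p, q|). The names `block_norm_map` and `nn_r` are hypothetical and are not taken from the paper.

```python
import math


def block_norm_map(p, d_prime):
    """Stand-in DR map (an assumption, not the paper's transform).

    Splits p into contiguous blocks and returns the Euclidean norm of each
    block, so the mapped distance never exceeds the original distance.
    """
    size = math.ceil(len(p) / d_prime)  # coordinates per block
    return [
        math.sqrt(sum(x * x for x in p[start:start + size]))
        for start in range(0, len(p), size)
    ]


def nn_r(points, mapped, i, r):
    """Return (neighbors, skipped) for the point with index i.

    neighbors: indices j != i with Euclidean distance to points[i] <= r.
    skipped:   number of candidates rejected using only the mapped points.
    """
    neighbors = []
    skipped = 0
    for j, q in enumerate(points):
        if j == i:
            continue
        # Cheap test in d' dimensions: if the lower bound already exceeds r,
        # the full distance must exceed r too, so q is rejected early.
        if math.dist(mapped[i], mapped[j]) > r:
            skipped += 1
            continue
        # Only surviving candidates pay for the full d-dimensional distance.
        if math.dist(points[i], q) <= r:
            neighbors.append(j)
    return neighbors, skipped
```

In use, the mapped points would be computed once (for example `mapped = [block_norm_map(p, 4) for p in points]`) and reused for every query. How many candidates are pruned depends on the data and on d'; the sketch makes no claim about the speedups reported in the paper.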
Pages: 3880-3889
Number of pages: 10