Soft dimensionality reduction for reinforcement data clustering

Cited by: 0
Authors
Fathinezhad, Fatemeh [1]
Adibi, Peyman [1]
Shoushtarian, Bijan [1]
Baradaran Kashani, Hamidreza [1]
Chanussot, Jocelyn [2]
Affiliations
[1] Univ Isfahan, Fac Comp Engn, Artificial Intelligence Dept, Esfahan, Iran
[2] Univ Grenoble Alpes, GIPSA Lab, CNRS, Grenoble INP, Grenoble, France
Funding
National Science Foundation (USA);
Keywords
Dimensionality reduction; Soft feature selection; Reinforcement learning; Data clustering; LAPLACIAN EIGENMAPS; FEATURE-SELECTION;
DOI
10.1007/s11280-023-01158-y
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The standard Euclidean distance gives every feature an equal contribution when computing the similarity matrix for each pair of data samples, whereas the features of real-world datasets differ in importance. This paper proposes a new clustering method based on reinforcement learning and soft feature selection, built on three ideas. First, a novel distance metric based on feature importance is introduced, which can additionally make irrelevant features approximately vanish. Second, a new soft weighting mechanism is defined on this distance to determine the effect of the neighborhood probability in the similarity matrix. Because the training data contain noisy and redundant features, a sparsity regularization term is applied to address this problem and to emphasize feature selection. Third, after these dimensionality reduction steps, a new clustering method is developed based on reinforcement learning, which treats the obtained low-dimensional data points as the states of the learning agents. Until convergence, it uses different actions to transfer the worst points, those with the most scattering, from one cluster to another, so as to produce coherent clusters and keep them balanced. The proposed method achieves high within-cluster consistency. Experimental results on several real-world datasets show good performance and efficiency of the proposed method. Statistical analysis, parameter sensitivity analysis, and time complexity analysis all confirm the appropriateness of the results obtained.
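The abstract does not give the distance metric or the neighborhood probability in closed form, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a non-negative per-feature weight vector `w` (a hypothetical name) scaling a squared Euclidean distance, and an SNE-style row-normalised Gaussian affinity with an assumed bandwidth `sigma`. A weight near zero makes a feature contribute almost nothing, which is one way an irrelevant feature can "approximately vanish".

```python
import numpy as np

def weighted_sq_distances(X, w):
    """Pairwise squared distances with per-feature weights.

    X: (n, d) data matrix; w: (d,) non-negative weights (assumed, illustrative).
    d2[i, j] = sum_k w[k] * (X[i, k] - X[j, k]) ** 2
    """
    Xw = X * np.sqrt(w)                      # scale each feature by sqrt(weight)
    sq = np.sum(Xw ** 2, axis=1)             # squared norms of the scaled rows
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Xw @ Xw.T)
    return np.maximum(d2, 0.0)               # clip tiny negative round-off

def neighborhood_probabilities(d2, sigma=1.0):
    """Row-normalised Gaussian affinities, excluding self-pairs (SNE-style)."""
    logits = -d2 / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)        # a point is not its own neighbour
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

# Toy data: feature 1 is pure "noise" that separates points 0 and 1.
X = np.array([[0.0, 0.0], [0.0, 10.0], [1.0, 0.0]])
uniform = weighted_sq_distances(X, np.array([1.0, 1.0]))  # plain Euclidean
soft = weighted_sq_distances(X, np.array([1.0, 0.0]))     # feature 1 switched off
P = neighborhood_probabilities(soft)
```

With uniform weights, points 0 and 1 are far apart because of the second feature alone; with its weight set to zero they coincide. In the paper the weights are learned under a sparsity regularizer rather than set by hand as here.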
Pages: 3027-3054
Number of pages: 28
Related papers
50 in total
  • [1] Soft dimensionality reduction for reinforcement data clustering
    Fatemeh Fathinezhad
    Peyman Adibi
    Bijan Shoushtarian
    Hamidreza Baradaran Kashani
    Jocelyn Chanussot
    World Wide Web, 2023, 26 : 3027 - 3054
  • [2] Data dimensionality reduction technique for clustering problem of metabolomics data
    Rustam
    Gunawan, Agus Yodi
    Kresnowati, Made Tri Ari Penia
    HELIYON, 2022, 8 (06)
  • [3] Dimensionality Reduction for Clustering and Cluster Tracking of Cytometry Data
    Putri, Givanna H.
    Read, Mark N.
    Koprinska, Irena
    Ashhurst, Thomas M.
    King, Nicholas J. C.
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: TEXT AND TIME SERIES, PT IV, 2019, 11730 : 624 - 640
  • [4] Dimensionality Reduction for Clustering of Nonlinear Industrial Data: A Tutorial
    Roh, Hae Rang
    Kim, Chae Sun
    Lee, Yongseok
    Lee, Jong Min
    KOREAN JOURNAL OF CHEMICAL ENGINEERING, 2025, : 987 - 1001
  • [5] Distributed dimensionality reduction of industrial data based on clustering
    Zhang, Yongyan
    Xie, Guo
    Wang, Wenqing
    Wang, Xiaofan
    Qian, Fucai
    Du, Xulong
    Du, Jinhua
    PROCEEDINGS OF THE 2018 13TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2018), 2018, : 370 - 374
  • [6] Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique
    Akmal, Saba
    Asif, Hafiz Muhammad Shahzad
    MEHRAN UNIVERSITY RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY, 2021, 40 (03) : 630 - 644
  • [7] Reduction of Dimensionality in Structured Data Sets on Clustering Efficiency in Data Mining
    Pasha, Noor
    Ashokkumar, P. S.
    Venkatesh, P.
    Krishna, Gopal C.
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMPUTING RESEARCH (ICCIC), 2017, : 1020 - 1023
  • [8] Denoising Autoencoder as an Effective Dimensionality Reduction and Clustering of Text Data
    Leyli-Abadi, Milad
    Labiod, Lazhar
    Nadif, Mohamed
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2017, PT II, 2017, 10235 : 801 - 813
  • [9] AdaCLV for interpretable variable clustering and dimensionality reduction of spectroscopic data
    Marion, Rebecca
    Govaerts, Bernadette
    von Sachs, Rainer
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2020, 206
  • [10] Manifold Learning for Dimensionality Reduction and Clustering of Skin Spectroscopy Data
    Safi, Asad
    Castaneda, Victor
    Lasser, Tobias
    Mateus, Diana C.
    Navab, Nassir
    MEDICAL IMAGING 2011: COMPUTER-AIDED DIAGNOSIS, 2011, 7963