Soft dimensionality reduction for reinforcement data clustering

Citations: 0
Authors
Fathinezhad, Fatemeh [1]
Adibi, Peyman [1]
Shoushtarian, Bijan [1]
Baradaran Kashani, Hamidreza [1]
Chanussot, Jocelyn [2]
Affiliations
[1] Univ Isfahan, Fac Comp Engn, Artificial Intelligence Dept, Esfahan, Iran
[2] Univ Grenoble Alpes, GIPSA Lab, CNRS, Grenoble INP, Grenoble, France
Source
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2023 / Vol. 26 / Issue 05
Funding
National Science Foundation (USA);
Keywords
Dimensionality reduction; Soft feature selection; Reinforcement learning; Data clustering; LAPLACIAN EIGENMAPS; FEATURE-SELECTION;
DOI
10.1007/s11280-023-01158-y
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The standard Euclidean distance gives every feature an equal contribution when computing the similarity matrix between pairs of data samples, whereas the features of real-world datasets differ in importance. This paper proposes a new clustering method based on reinforcement learning and soft feature selection, built on three ideas. First, a novel distance metric based on feature importance is introduced, which can also approximately eliminate irrelevant features. Second, a new soft weighting mechanism is defined on top of this distance to determine the effect of the neighborhood probability in the similarity matrix. Because the training data contains noisy and redundant features, a sparsity regularization term is applied to address this problem and to emphasize feature selection. Third, after these dimensionality reduction steps, a new clustering method based on reinforcement learning is developed, which treats the resulting low-dimensional data points as the states of the learning agents. Until convergence, it applies different actions that move the worst points, those with the most scattering, from one cluster to another, in order to produce coherent clusters and to balance them. The proposed method achieves high within-cluster consistency. Experimental results on several real-world datasets show that the proposed method performs well and efficiently. Statistical analysis, parameter sensitivity analysis, and time complexity analysis all support the results obtained.
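The idea of weighting features inside the distance can be illustrated with a minimal sketch. The abstract does not give the paper's actual metric, so the weighted Euclidean form, the Gaussian kernel, and the names `weighted_distance`, `similarity_matrix`, `w`, and `sigma` below are all assumptions chosen for illustration, not the authors' formulation.

```python
import math


def weighted_distance(x, y, w):
    """Assumed form: sqrt(sum_k w_k * (x_k - y_k)^2), with w_k >= 0.

    A weight of 0 removes a feature from the distance entirely, which is
    one simple way to picture "soft" feature selection.
    """
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    return math.sqrt(sum(wk * (xk - yk) ** 2 for xk, yk, wk in zip(x, y, w)))


def similarity_matrix(points, w, sigma=1.0):
    """Gaussian similarity built on the weighted distance (assumed kernel)."""
    n = len(points)
    return [
        [
            math.exp(-weighted_distance(points[i], points[j], w) ** 2
                     / (2 * sigma ** 2))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Two samples whose third feature is pure noise (hypothetical data).
points = [[0.0, 0.0, 5.0], [3.0, 4.0, -5.0]]
uniform = [1.0, 1.0, 1.0]  # standard Euclidean distance
sparse = [1.0, 1.0, 0.0]   # third feature switched off

d_uniform = weighted_distance(points[0], points[1], uniform)
d_sparse = weighted_distance(points[0], points[1], sparse)
S = similarity_matrix(points, sparse, sigma=5.0)
```

With uniform weights the noisy third feature dominates the distance. With its weight set to zero, the two samples are compared on the first two features only. In the paper the weights are learned under a sparsity regularizer rather than set by hand.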
Pages: 3027-3054
Number of pages: 28