Soft dimensionality reduction for reinforcement data clustering

Cited by: 0

Authors:

Fathinezhad, Fatemeh [1]

Adibi, Peyman [1]

Shoushtarian, Bijan [1]

Baradaran Kashani, Hamidreza [1]

Chanussot, Jocelyn [2]

Affiliations:

[1] Univ Isfahan, Fac Comp Engn, Artificial Intelligence Dept, Esfahan, Iran

[2] Univ Grenoble Alpes, GIPSA Lab, CNRS, Grenoble INP, Grenoble, France

Source:

WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2023 / Volume 26 / Issue 05

Funding:

National Science Foundation (USA);

Keywords:

Dimensionality reduction; Soft feature selection; Reinforcement learning; Data clustering; LAPLACIAN EIGENMAPS; FEATURE-SELECTION;

DOI:

10.1007/s11280-023-01158-y

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

The standard Euclidean distance assigns equal contributions to all features of each pair of data samples when computing the similarity matrix, whereas the features of real-world datasets differ in importance. This paper proposes a new clustering method based on reinforcement learning and soft feature selection, built on three ideas. First, a novel distance metric based on feature importance is introduced, which can additionally make irrelevant features approximately vanish. Second, a new soft weighting mechanism is defined on top of this distance to determine the effect of the neighborhood probability in the similarity matrix. Because the training data contain noisy and redundant features, a sparsity regularization term is applied to address this problem and to emphasize feature selection. Third, after these dimensionality reduction steps, a new clustering method based on reinforcement learning is developed, which treats the obtained low-dimensional data points as the states of the learning agents. Until convergence, it uses different actions to transfer the worst points, those with the most scattering, from one cluster to another, in order to produce coherent clusters and to balance them. The proposed method achieves high within-cluster consistency. Experimental results on several real-world datasets show good performance and efficiency of the proposed method. Statistical analysis, parameter sensitivity analysis, and time complexity analysis all confirm the appropriateness of the results obtained.
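The abstract gives no formulas, so the following is only an illustrative sketch of the general idea it describes: a per-feature weight inside a squared Euclidean distance, fed into a softmax-style neighborhood probability. The function names, the weight vector `w`, and the exponential kernel are all assumptions made for illustration and are not taken from the paper.

```python
import math

def weighted_sq_distance(x, y, w):
    """Squared Euclidean distance with one non-negative weight per feature.

    A weight of 0 removes a feature from the distance entirely; a small
    weight makes it nearly vanish (hypothetical stand-in for soft selection).
    """
    return sum(wk * (xk - yk) ** 2 for xk, yk, wk in zip(x, y, w))

def neighborhood_probabilities(X, w):
    """Row-normalized exp(-distance) similarities, excluding self-pairs.

    Assumes at least two samples and distances small enough that
    exp(-d) does not underflow to zero for every neighbor.
    """
    n = len(X)
    P = []
    for i in range(n):
        scores = [
            0.0 if j == i else math.exp(-weighted_sq_distance(X[i], X[j], w))
            for j in range(n)
        ]
        total = sum(scores)
        P.append([s / total for s in scores])
    return P

# Toy data: the second feature is noisy and is switched off by its weight.
X = [[0.0, 0.0], [1.0, 10.0], [0.0, 5.0]]
w = [1.0, 0.0]
P = neighborhood_probabilities(X, w)
```

With the second feature weighted to zero, samples 0 and 2 coincide in the weighted space, so each becomes the other's most probable neighbor even though their raw Euclidean distance is large.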


Pages: 3027-3054

Page count: 28

