Joint long and short span self-attention network for multi-view classification

Cited by: 1

Authors:

Chen, Zhikui [1]

Lou, Kai [1]

Liu, Zhenjiao [1]

Li, Yue [1]

Luo, Yiming [1]

Zhao, Liang [1]

Affiliation:

[1] Dalian Univ Technol, Sch Software, Dalian 116620, Peoples R China

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2024 / Volume 235

Keywords:

Multi-view classification; Self-attention mechanism; Multi-view fusion; DIMENSIONALITY; MODEL;

DOI:

10.1016/j.eswa.2023.121152

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Multi-view classification aims to efficiently utilize information from different views to improve classification performance. In recent research, many effective multi-view learning methods have been proposed for multi-view data analysis. However, most existing methods consider only the correlations between views and ignore the potential correlations between samples. Normally, the views of samples belonging to the same category should share more consistent information, and those belonging to different categories should be more distinct. Therefore, we argue that the correlations and distinctions between the views of different samples also contribute to constructing feature representations that are more conducive to classification. To construct a general end-to-end multi-view classification framework that makes better use of sample information to obtain more reasonable feature representations, we propose a novel joint long and short span self-attention network (JLSSAN). We design two different self-attention spans that focus on different information. They enable each feature vector to be iteratively updated based on its attention to other views and other samples, which provides better integration of information from different views and different samples. In addition, we adopt a novel weight-based loss fusion strategy, which helps the model learn a more reasonable self-attention map between views. Our method outperforms state-of-the-art methods by more than 3% in accuracy on multiple benchmarks, which demonstrates its effectiveness.
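The idea of updating each feature vector from its attention to other views and to other samples can be illustrated with a minimal sketch. This is not the JLSSAN architecture: the abstract does not give the layer design, so the sketch assumes plain scaled dot-product self-attention with no learned projections (Q = K = V), assumes all views share one feature dimension, and assumes that the "short span" attends across the views of one sample while the "long span" attends across the samples of one view. The names `self_attention` and `two_span_update` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with Q = K = V = vectors.

    Assumption: no learned projection matrices, for illustration only.
    """
    dim = len(vectors[0])
    updated = []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # Each output is a weighted average of all input vectors.
        updated.append(
            [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
        )
    return updated


def two_span_update(features):
    """features[i][v] is the feature vector of sample i in view v.

    Assumed mapping (not stated in the abstract):
    short span = across views of one sample,
    long span  = across samples within one view.
    """
    n_samples, n_views = len(features), len(features[0])
    # Short span: the views of each sample attend to one another.
    short = [self_attention(sample_views) for sample_views in features]
    # Long span: within each view, the samples attend to one another.
    result = [[None] * n_views for _ in range(n_samples)]
    for v in range(n_views):
        column = self_attention([short[i][v] for i in range(n_samples)])
        for i in range(n_samples):
            result[i][v] = column[i]
    return result


# Three samples, two views, two-dimensional features.
feats = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
out = two_span_update(feats)
```

Because every update is a softmax-weighted average, each output vector stays within the range spanned by the inputs, and the sample-by-view layout is preserved. In a trained model the attention weights would come from learned projections, and the update would typically be repeated over several layers, as the abstract's "iteratively updated" suggests.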


Pages: 10
