Joint long and short span self-attention network for multi-view classification

Cited: 1
Authors
Chen, Zhikui [1 ]
Lou, Kai [1 ]
Liu, Zhenjiao [1 ]
Li, Yue [1 ]
Luo, Yiming [1 ]
Zhao, Liang [1 ]
Affiliations
[1] Dalian Univ Technol, Sch Software, Dalian 116620, Peoples R China
Keywords
Multi-view classification; Self-attention mechanism; Multi-view fusion; DIMENSIONALITY; MODEL;
D O I
10.1016/j.eswa.2023.121152
CLC classification
TP18 [Theory of artificial intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-view classification aims to efficiently utilize information from different views to improve classification performance. In recent research, many effective multi-view learning methods have been proposed for multi-view data analysis. However, most existing methods only consider the correlations between views and ignore the potential correlations between samples. Normally, the views of samples belonging to the same category should share more consistent information, and those belonging to different categories should exhibit more distinctions. Therefore, we argue that the correlations and distinctions between the views of different samples also contribute to constructing feature representations that are more conducive to classification. To build an end-to-end general multi-view classification framework that better utilizes sample information to obtain more reasonable feature representations, we propose a novel joint long and short span self-attention network (JLSSAN). We design two different self-attention spans to focus on different information. They enable each feature vector to be iteratively updated based on its attention to other views and other samples, which provides better integration of information from different views and different samples. Besides, we adopt a novel weight-based loss fusion strategy, which helps the model learn more reasonable self-attention maps between views. Our method outperforms the state-of-the-art methods by more than 3% in accuracy on multiple benchmarks, which demonstrates its effectiveness.
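The abstract does not give the paper's exact architecture, but the idea of the two attention spans can be illustrated with a minimal NumPy sketch. Here, a "short span" restricts self-attention to the views of one sample, while a "long span" lets every view vector attend to the views of all samples in a batch. The function names `short_span`/`long_span` and all shapes are hypothetical, for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    # plain scaled dot-product self-attention over a (tokens, d) matrix;
    # a real model would use learned query/key/value projections
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ tokens

def short_span(features):
    # features: (n_samples, n_views, d); attend only across
    # the views of each sample independently
    return np.stack([self_attention(sample) for sample in features])

def long_span(features):
    # flatten (samples, views) into one token sequence so each
    # view vector can also attend to the views of other samples
    n, v, d = features.shape
    out = self_attention(features.reshape(n * v, d))
    return out.reshape(n, v, d)

# usage: a batch of 4 samples, 3 views each, 8-dim features
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3, 8))
X_short = short_span(X)   # shape (4, 3, 8)
X_long = long_span(X)     # shape (4, 3, 8)
```

Both spans preserve the feature shape, so the two updated representations could be fused (e.g. summed or concatenated) and iterated, matching the abstract's description of iterative updates; the actual fusion and the weight-based loss strategy are specified in the paper itself.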
Pages: 10