Transformer models, which revolutionized foundation models, are now ubiquitous. Consequently, there has been a surge of pre-trained transformers that can be fine-tuned for different downstream tasks. However, most pre-trained transformers are trained on only a single modality, and there is no direct way to fine-tune them on multiple modalities. To address this issue, in this paper we propose a general-purpose gate, SSIM (Switch off, Switch on, and Integrate Modalities), through which other modalities can be integrated into large pre-trained language transformers. The proposed SSIM gate produces a unified representation by soft-switching between multi-modal interactions. To evaluate our approach, we establish benchmarks with pre-trained language transformers such as BERT, XLNet, and T5 on multi-modal tasks, namely Sentiment and Emotion Analysis (CMU-MOSI, CMU-MOSEI), Emotion Recognition in Conversations (IEMOCAP, MELD), and Multimodal Intent Recognition (MIntRec), achieving results close to the state of the art.
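To make the idea of soft-switching concrete, the sketch below shows one plausible form of such a gate: a sigmoid-gated convex combination that lets the model "switch off" an extra modality (gate near 1) or "switch on" and integrate it (gate near 0) per hidden dimension. This is a minimal illustration under assumed shapes and module names (`SoftSwitchGate`, a 768-dimensional text encoder, 74-dimensional acoustic features); the exact SSIM formulation is given in the method section and may differ.

```python
import torch
import torch.nn as nn

class SoftSwitchGate(nn.Module):
    """Illustrative soft-switching gate (hypothetical form, not the exact SSIM definition)."""

    def __init__(self, text_dim: int, other_dim: int):
        super().__init__()
        self.proj = nn.Linear(other_dim, text_dim)      # map the other modality into the text space
        self.gate = nn.Linear(2 * text_dim, text_dim)   # per-dimension switch values

    def forward(self, h_text: torch.Tensor, h_other: torch.Tensor) -> torch.Tensor:
        h_other = self.proj(h_other)
        g = torch.sigmoid(self.gate(torch.cat([h_text, h_other], dim=-1)))
        # g -> 1: keep the language representation (switch off the extra modality)
        # g -> 0: switch on and integrate the projected extra modality
        return g * h_text + (1.0 - g) * h_other

# Usage: fuse aligned acoustic features into the hidden states of a pre-trained language transformer
gate = SoftSwitchGate(text_dim=768, other_dim=74)
h_text = torch.randn(2, 50, 768)    # e.g. BERT hidden states (batch, seq, dim)
h_audio = torch.randn(2, 50, 74)    # aligned acoustic features
fused = gate(h_text, h_audio)       # unified representation, same shape as h_text
```

The fused output keeps the shape of the language hidden states, so it can be fed back into the remaining layers of the pre-trained transformer without architectural changes.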