Multimodal feature fusion for concreteness estimation

Cited by: 0
Authors
Incitti, Francesca [1]
Snidaro, Lauro [1]
Affiliations
[1] Univ Udine, Dept Math Comp Sci & Phys, Udine, Italy
Source
2022 25TH INTERNATIONAL CONFERENCE ON INFORMATION FUSION (FUSION 2022) | 2022
Keywords
Word Embeddings; Feature fusion; NLP; Multimodal feature learning; ELMo; BERT; CLIP; Concreteness estimation; Autoencoders; Dimensionality reduction
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, the idea of fusing diverse types of information has often been employed to solve various deep learning tasks. Whether the task is an NLP problem or a machine vision one, the concept of using multiple inputs of the same type has been the basis of many studies. For NLP problems, combinations of different word embeddings have already been tried and have improved results on the most common benchmarks. Here we explore the combination not only of different types of input, but also of different data modalities. This is done by fusing two popular word embeddings, namely ELMo and BERT, with other inputs that embed a visual description of the analysed text. In this way, different modalities (textual and visual) are both employed to solve a textual problem, a concreteness estimation task. Multimodal feature fusion is explored through several techniques: input redundancy, concatenation, averaging, dimensionality reduction and augmentation. Combining these techniques generates different vector representations; the goal is to understand which feature fusion techniques yield more accurate embeddings.
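To make the fusion operations named in the abstract concrete, the following is a minimal sketch of concatenation, averaging and dimensionality reduction. It is not the authors' implementation. The embeddings are random placeholders, the dimensions (1024, 768, 512) are only typical sizes assumed for ELMo, BERT and CLIP vectors, the function names are hypothetical, and PCA is used as a simple linear stand-in for the autoencoder-based reduction mentioned in the keywords.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length so no embedding dominates by magnitude."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)

def fuse_concat(*embeddings):
    """Concatenation: place the normalized feature vectors side by side."""
    return np.concatenate([l2_normalize(e) for e in embeddings], axis=1)

def fuse_average(*embeddings):
    """Average: element-wise mean; all inputs must share one dimensionality."""
    if len({e.shape[1] for e in embeddings}) != 1:
        raise ValueError("averaging needs embeddings of equal dimension")
    return np.mean([l2_normalize(e) for e in embeddings], axis=0)

def reduce_pca(x, k):
    """Project onto the top-k principal components (assumed stand-in for an autoencoder)."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

# Placeholder data: one row per word, random values instead of real embeddings.
rng = np.random.default_rng(0)
n_words = 200
elmo = rng.normal(size=(n_words, 1024))  # assumed ELMo size
bert = rng.normal(size=(n_words, 768))   # assumed BERT size
clip = rng.normal(size=(n_words, 512))   # assumed CLIP size

# Fusion by concatenation keeps every feature: 1024 + 768 + 512 columns.
concat = fuse_concat(elmo, bert, clip)

# Averaging needs a shared dimension, so each embedding is reduced first.
k = 64
averaged = fuse_average(*(reduce_pca(e, k) for e in (elmo, bert, clip)))

print(concat.shape, averaged.shape)
```

In a setup like this, each fused matrix would then be fed to a regressor trained on human concreteness ratings, which allows the representations to be compared on the same task.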
Pages: 8