Learning multimodal word representation with graph convolutional networks

Cited by: 11
Authors
Zhu, Wenhao [1]
Liu, Shuang [1]
Liu, Chaoming [1]
Affiliations
[1] Shanghai Univ, Shanghai 201900, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Natural language processing; Word representation; Multimodal word representation; Graph convolutional network;
DOI
10.1016/j.ipm.2021.102709
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Multimodal models have been shown to outperform text-based models at learning semantic word representations. According to psycholinguistic theory, there is a graphical relationship among the modalities of language, and in recent years the graph convolutional network (GCN) has demonstrated substantial advantages in extracting features from non-Euclidean spaces. This inspires us to propose a new multimodal word representation model, GCNW, which uses a graph convolutional network to incorporate phonetic and syntactic information into word representations. We use a greedy strategy to update the modality-relation matrix in the GCN and train the model through unsupervised learning. We evaluate the proposed model on multiple downstream NLP tasks, and the experimental results demonstrate that GCNW outperforms strong unimodal baselines and state-of-the-art multimodal models. We make the source code of both models available to encourage reproducible research.
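
For readers unfamiliar with graph convolution, the following is a minimal sketch (plain Python/NumPy) of a standard GCN propagation step over a small word graph, where the adjacency matrix stands in for a modality-relation matrix built from phonetic/syntactic links. This is an illustration only, not the authors' GCNW implementation; the matrix names, dimensions, and toy graph are assumptions.

import numpy as np

def gcn_layer(H, A, W):
    # One GCN propagation step: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)
    A_hat = A + np.eye(A.shape[0])                              # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))               # D^-1/2 diagonal
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)                      # ReLU activation

# Toy example: 4 words, 8-dimensional textual embeddings.
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 8))          # initial (text-based) word vectors
A = np.array([[0., 1., 0., 1.],      # hypothetical modality-relation matrix:
              [1., 0., 1., 0.],      # edges drawn from phonetic/syntactic relations
              [0., 1., 0., 0.],
              [1., 0., 0., 0.]])
W = rng.normal(size=(8, 8))          # learnable weight matrix
H_fused = gcn_layer(H, A, W)         # fused word representations, shape (4, 8)

Each word vector is updated by averaging (with degree normalization) over its graph neighbors before the linear transform, which is how graph structure from other modalities can be folded into the textual embeddings.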
Pages: 11