Learning distributed word representation with multi-contextual mixed embedding

Cited by: 46
Authors
Li, Jianqiang [1 ]
Li, Jing [1 ]
Fu, Xianghua [1 ]
Masud, M. A. [1 ]
Huang, Joshua Zhexue [1 ]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
Keywords
Word embedding; Distributed word representation; Word2vec; Natural language processing; Models
DOI
10.1016/j.knosys.2016.05.045
CLC number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Learning distributed word representations has become a popular approach for various natural language processing applications such as word analogy and similarity, document classification, and sentiment analysis. However, most existing word embedding models exploit only a shallow sliding window as the context for predicting the target word. Because the semantics of each word are also shaped by its global context, as distributional models typically induce word representations from a global co-occurrence matrix, window-based models are insufficient to capture semantic knowledge. In this paper, we propose a novel hybrid model called mixed word embedding (MWE) based on the well-known word2vec toolbox. Specifically, the proposed MWE model combines the two variants of word2vec, i.e., SKIP-GRAM and CBOW, in a seamless way by sharing a common encoding structure, which captures the syntactic information of words more accurately. Furthermore, it incorporates a global text vector into the CBOW variant so as to capture more semantic information. Our MWE preserves the same time complexity as SKIP-GRAM. To evaluate the MWE model efficiently and adaptively, we study it from both linguistic and application perspectives on English and Chinese datasets. For linguistics, we conduct empirical studies on word analogies and similarities. For applications, we evaluate the learned representations on document classification and sentiment analysis. The experimental results show that our MWE model is very competitive in all tasks compared with state-of-the-art word embedding models such as CBOW, SKIP-GRAM, and GloVe. (C) 2016 Elsevier B.V. All rights reserved.
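The abstract describes combining the CBOW and SKIP-GRAM objectives over a shared embedding structure, with a global text vector added to the CBOW context. The toy NumPy sketch below illustrates that idea under our own simplifying assumptions (full-softmax loss, a two-sentence corpus, made-up hyperparameters); it is not the authors' implementation, which builds on the word2vec toolbox and its efficient output layer.

```python
import numpy as np

rng = np.random.default_rng(0)

corpus = [["the", "cat", "sat", "on", "the", "mat"],
          ["the", "dog", "sat", "on", "the", "rug"]]
vocab = sorted({w for doc in corpus for w in doc})
idx = {w: i for i, w in enumerate(vocab)}
V, D, win = len(vocab), 16, 2

W_in = rng.normal(scale=0.1, size=(V, D))    # shared input embeddings
W_out = rng.normal(scale=0.1, size=(V, D))   # shared output embeddings
doc_vec = rng.normal(scale=0.1, size=(len(corpus), D))  # global text vectors

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def train_step(lr=0.05):
    """One SGD pass over the corpus with the mixed objective."""
    global W_out
    loss = 0.0
    for d, doc in enumerate(corpus):
        ids = [idx[w] for w in doc]
        for t, c in enumerate(ids):
            lo, hi = max(0, t - win), min(len(ids), t + win + 1)
            ctx = [ids[j] for j in range(lo, hi) if j != t]
            # CBOW branch with a global text vector:
            # averaged context + document vector predicts the center word.
            h = W_in[ctx].mean(axis=0) + doc_vec[d]
            p = softmax(W_out @ h)
            loss += -np.log(p[c])
            g = p.copy()
            g[c] -= 1.0                      # dLoss/dscores
            gh = W_out.T @ g                 # dLoss/dh
            W_out -= lr * np.outer(g, h)
            for j in ctx:                    # per-occurrence context update
                W_in[j] -= lr * gh / len(ctx)
            doc_vec[d] -= lr * gh
            # SKIP-GRAM branch sharing the same embedding matrices:
            # the center word predicts each context word.
            v = W_in[c]
            p2 = softmax(W_out @ v)
            for j in ctx:
                loss += -np.log(p2[j])
            g2 = len(ctx) * p2               # summed softmax gradient
            for j in ctx:
                g2[j] -= 1.0
            W_in[c] -= lr * (W_out.T @ g2)
            W_out -= lr * np.outer(g2, v)
    return loss

losses = [train_step() for _ in range(30)]
```

Because both branches write into the same `W_in`/`W_out` matrices, each token is trained from two complementary views of its context, which is the mixing idea the abstract sketches; the `doc_vec` term is the global text vector injected into the CBOW input.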
Pages: 220-230
Page count: 11