Cross-lingual sentiment classification with stacked autoencoders

Cited by: 19
Authors
Zhou, Guangyou [1]
Zhu, Zhiyuan [3]
He, Tingting [2]
Hu, Xiaohua Tony [1,4]
Affiliations
[1] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China
[2] Cent China Normal Univ, Sch Comp, Nat Language Proc Lab, Wuhan 430079, Peoples R China
[3] Chinese Inst Elect, Beijing 100036, Peoples R China
[4] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation; US National Science Foundation;
Keywords
Sentiment classification; Cross-lingual; Stacked autoencoder;
DOI
10.1007/s10115-015-0849-0
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cross-lingual sentiment classification is a popular research topic in natural language processing. The fundamental challenge of cross-lingual learning stems from a lack of overlap between the feature spaces of the source language data and the target language data. In this article, we propose a new model that uses stacked autoencoders to learn language-independent high-level feature representations for both languages in an unsupervised fashion. The proposed framework aims to force aligned bilingual input sentences into a common latent space, and the objective function is defined to minimize the reconstruction error between the input and output vector representations as well as the distance between the common representations in the latent space. With these language-independent high-level feature representations, sentiment classifiers trained on the source language can be adapted to predict the sentiment polarity of target-language text. We conduct extensive experiments on English-Chinese sentiment classification tasks across multiple data sets. Our experimental results demonstrate the efficacy of the proposed cross-lingual approach.
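The objective described in the abstract can be illustrated with a minimal sketch: one autoencoder per language, with a loss that sums each side's reconstruction error and a penalty on the distance between the two latent codes of an aligned sentence pair. The record gives no architectural details, so everything here is an assumption made for illustration: the single hidden layer (the paper stacks several), the sigmoid activation, squared Euclidean distances, the weighting parameter `lam`, and all function names.

```python
import math
import random


def affine(W, b, x):
    """Return W @ x + b for a row-major weight matrix W."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def encode(W, b, x):
    """Sigmoid hidden layer (the choice of activation is an assumption)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in affine(W, b, x)]


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - c) ** 2 for a, c in zip(u, v))


def init_params(n_in, n_hidden, rng):
    """Small random encoder/decoder weights with zero biases."""
    W_enc = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    W_dec = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_in)]
    return W_enc, [0.0] * n_hidden, W_dec, [0.0] * n_in


def bilingual_loss(p_src, p_tgt, x_src, x_tgt, lam=1.0):
    """Reconstruction error of both autoencoders plus lam * latent distance."""
    h_src = encode(p_src[0], p_src[1], x_src)   # latent code, source language
    h_tgt = encode(p_tgt[0], p_tgt[1], x_tgt)   # latent code, target language
    rec_src = affine(p_src[2], p_src[3], h_src)  # linear decoder (assumption)
    rec_tgt = affine(p_tgt[2], p_tgt[3], h_tgt)
    return (sq_dist(x_src, rec_src) + sq_dist(x_tgt, rec_tgt)
            + lam * sq_dist(h_src, h_tgt))


# Hypothetical aligned pair of bag-of-words vectors, one per language.
rng = random.Random(0)
params_en = init_params(4, 2, rng)
params_zh = init_params(4, 2, rng)
loss = bilingual_loss(params_en, params_zh, [1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0])
```

Minimizing such a loss over aligned pairs pulls the two latent codes together, which is what would let a classifier trained on source-language codes be applied to target-language codes.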
Pages: 27-44
Page count: 18