An unsupervised data-driven cross-lingual method for building high precision sentiment lexicons

Cited by: 1
Authors
Sangiorgi, Pierluca [1]
Augello, Agnese [1]
Pilato, Giovanni [1]
Affiliation
[1] CNR, ICAR Ist Calcolo & Reti Ad Alte Prestazioni, I-90128 Palermo, Italy
Source
2013 IEEE SEVENTH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC 2013) | 2013
Keywords
Sentiment Analysis; Sentiment Lexicon; Machine Learning;
DOI
10.1109/ICSC.2013.40
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper we present a completely unsupervised approach for creating a sentiment lexicon. The approach is realized as a pipeline that covers the automatic extraction of user reviews, the pre-processing of text, a scoring measure that combines entropy, term frequency, and inverse document frequency, and finally a cross-lingual intersection. We validated the approach through the analysis of app reviews in the Google Play market. The results show the effectiveness of the approach, as indicated by satisfactory precision values for the obtained lexicon.
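The abstract names the ingredients of the scoring measure but does not give its formula. The following minimal Python sketch shows one plausible way the pieces could fit together; it is not the authors' implementation. The product `tf * idf * (1 - entropy)`, the two-class labels `"pos"`/`"neg"`, the function names `score_terms` and `cross_lingual_intersection`, and the hand-written translation dictionary are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def score_terms(reviews):
    """Score each term of one language (hypothetical measure).

    reviews: list of (tokens, label) pairs, label in {"pos", "neg"}.
    Returns {term: (majority_polarity, score)}.
    """
    n_docs = len(reviews)
    tf = Counter()                   # total occurrences of each term
    df = Counter()                   # number of reviews containing the term
    by_class = defaultdict(Counter)  # term -> occurrences per class
    for tokens, label in reviews:
        tf.update(tokens)
        df.update(set(tokens))
        for term in tokens:
            by_class[term][label] += 1

    scores = {}
    for term, counts in by_class.items():
        total = sum(counts.values())
        probs = [c / total for c in counts.values()]
        # Entropy in bits: 0 if the term occurs in one class only,
        # 1 if it is split evenly between the two classes.
        entropy = -sum(p * math.log2(p) for p in probs)
        idf = math.log(n_docs / df[term])
        polarity = counts.most_common(1)[0][0]
        # Assumed combination: frequent, distinctive, class-pure terms score high.
        scores[term] = (polarity, tf[term] * idf * (1.0 - entropy))
    return scores


def cross_lingual_intersection(lex_a, lex_b, a_to_b, threshold=0.0):
    """Keep terms of language A whose translation in language B also scores
    above the threshold with the same polarity. a_to_b is a hypothetical
    bilingual dictionary."""
    kept = {}
    for term, (polarity, score) in lex_a.items():
        other = a_to_b.get(term)
        if score > threshold and other in lex_b:
            polarity_b, score_b = lex_b[other]
            if polarity_b == polarity and score_b > threshold:
                kept[term] = polarity
    return kept


# Toy data, invented for this example only.
english = [(["great", "app"], "pos"), (["great", "fun"], "pos"),
           (["bad", "app"], "neg"), (["bad", "crash"], "neg")]
italian = [(["ottima", "app"], "pos"), (["ottima", "utile"], "pos"),
           (["pessima", "app"], "neg"), (["pessima", "lenta"], "neg")]
en_to_it = {"great": "ottima", "bad": "pessima",
            "fun": "divertente", "app": "app"}

lexicon = cross_lingual_intersection(
    score_terms(english), score_terms(italian), en_to_it)
print(lexicon)  # {'great': 'pos', 'bad': 'neg'}
```

In this toy run, "app" is dropped because it is split evenly across classes (entropy 1, score 0), and "fun" is dropped because its translation never appears in the Italian reviews. That is the kind of filtering that would trade recall for precision.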
Pages: 184-190
Number of pages: 7