Associating targets with SentiUnits: a step forward in sentiment analysis of Urdu text

Cited by: 27
Authors
Syed, Afraz Z. [1]
Aslam, Muhammad [1]
Maria Martinez-Enriquez, Ana [2]
Affiliations
[1] Univ Engn & Technol, Lahore, Pakistan
[2] CINVESTAV IPN, Dept CS, Mexico City, DF, Mexico
Keywords
Natural language processing; Sentiment analysis; Opinion mining; Shallow parsing; Dependency parsing
DOI
10.1007/s10462-012-9322-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a grammatically motivated sentiment classification model applied to a morphologically rich language, Urdu. The morphological complexity of this language and the flexibility of its grammatical rules require an improved or altogether different approach. We emphasize the identification of SentiUnits rather than the subjective words in the given text. SentiUnits are the sentiment-carrying expressions that reveal the inherent sentiment of a sentence toward a specific target. The targets are the noun phrases about which an opinion is expressed. The system extracts SentiUnits and target expressions through shallow-parsing-based chunking. A dependency parsing algorithm then creates associations between these extracted expressions. For our system, we develop a sentiment-annotated lexicon of Urdu words. Each entry of the lexicon is marked with its orientation (positive or negative) and an intensity (force of orientation) score. To evaluate the system, we collect two corpora of reviews, from the domains of movies and electronic appliances. The experimental results show that we achieve state-of-the-art performance in sentiment analysis of Urdu text.
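As a rough illustration of the lexicon idea described in the abstract, the following Python sketch shows one way entries carrying an orientation and an intensity score could be combined for a SentiUnit that has already been associated with a target. Everything here is an assumption made for illustration: the names `LexiconEntry`, `LEXICON`, `score_senti_unit`, and `classify`, the English placeholder words, the intensity range, and the simple additive scoring are hypothetical and are not taken from the paper. Chunking and dependency parsing are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    orientation: int   # +1 for positive, -1 for negative
    intensity: float   # force of orientation; a [0, 1] range is assumed here


# Hypothetical placeholder entries (English stand-ins, not the paper's Urdu lexicon).
LEXICON = {
    "excellent": LexiconEntry(+1, 0.9),
    "good": LexiconEntry(+1, 0.5),
    "poor": LexiconEntry(-1, 0.6),
}


def score_senti_unit(words):
    """Sum orientation * intensity over the words found in the lexicon."""
    return sum(
        LEXICON[w].orientation * LEXICON[w].intensity
        for w in words
        if w in LEXICON
    )


def classify(pairs):
    """Map each (target, SentiUnit words) pair to a polarity label.

    The pairs are assumed to be already extracted and associated;
    this sketch only performs the lexicon lookup and labelling.
    """
    labels = {}
    for target, words in pairs:
        score = score_senti_unit(words)
        if score > 0:
            labels[target] = "positive"
        elif score < 0:
            labels[target] = "negative"
        else:
            labels[target] = "neutral"
    return labels
```

For example, `classify([("camera", ["very", "good"]), ("battery", ["poor"])])` labels the first target positive and the second negative under these placeholder scores. Words missing from the lexicon contribute nothing to the score.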
Pages: 535-561
Page count: 27