Translation Is Not Enough: Comparing Lexicon-based Methods for Sentiment Analysis in Persian

Cited by: 0
Authors
Basiri, Mohammad Ehsan [1]
Kabiri, Arman [1]
Affiliation
[1] Shahrekord Univ, Dept Comp Engn, Shahrekord, Iran
Source
2017 18TH CSI INTERNATIONAL SYMPOSIUM ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING CONFERENCE (CSSE) | 2017
Keywords
component; Sentiment Analysis; Natural Language Processing; Persian Language; Lexicon-based approach; Opinion mining; Data Mining; Social Media
DOI
Not available
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Sentiment analysis is a subfield of data mining and natural language processing that aims to extract people's opinions and appraisals from their comments on the Web. In contrast to machine learning approaches, lexicon-based methods have some important advantages: they are domain-independent and do not need a large annotated training corpus, and hence are faster. This makes the lexicon-based approach prevalent in the sentiment analysis community. However, for the Persian language, in contrast to English, using lexicon-based methods is a new discipline. Few lexicons are available for sentiment analysis in Persian, and almost all of them are directly translated from English. In the current study, four lexicons are compared to show the importance of the lexicon to the performance of document-level sentiment analysis. Specifically, the Persian version of the NRC lexicon, SentiStrength, CNRC, and Adjectives are compared in a pure lexicon-based scenario. Experiments are carried out on the document-level edition of the SPerSent dataset. Results show that the direct translation used in NRC leads to the poorest performance, while the pre-processing and refinement of lexicons used in SentiStrength and CNRC improve performance. The results also show that using only adjectives yields better results than using NRC.
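As a rough illustration of what a pure lexicon-based, document-level classifier can look like, the sketch below sums the polarity scores of the words in a document and labels it by the sign of the total. It is not the scoring scheme used in the paper, which the abstract does not specify. The names `TOY_LEXICON`, `score_document`, and `classify_document` are hypothetical, and the English entries are placeholders rather than entries from NRC, SentiStrength, CNRC, or Adjectives.

```python
from typing import Dict, Iterable

# Hypothetical toy lexicon (word -> polarity score). Illustrative only:
# these entries are NOT taken from NRC, SentiStrength, CNRC, or Adjectives.
TOY_LEXICON: Dict[str, int] = {
    "good": 1, "great": 2, "excellent": 2,
    "bad": -1, "poor": -2, "terrible": -2,
}


def score_document(tokens: Iterable[str], lexicon: Dict[str, int]) -> int:
    """Sum the polarity of every token; words missing from the lexicon count as 0."""
    return sum(lexicon.get(token.lower(), 0) for token in tokens)


def classify_document(text: str, lexicon: Dict[str, int]) -> str:
    """Label a document by the sign of its summed score (a zero total is neutral)."""
    # Assumption: whitespace tokenization; real Persian text would need
    # language-specific normalization and tokenization first.
    score = score_document(text.split(), lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

In a setup like this, the classifier's behavior depends entirely on which words the lexicon covers and how they are scored, which is why lexicon quality matters so much in the comparison the abstract describes.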
Pages: 36-41
Number of pages: 6
Related papers
50 in total
  • [1] Lexicon-based Sentiment Analysis for Urdu Language
    Ul Rehman, Zia
    Bajwa, Imran Sarwar
    2016 SIXTH INTERNATIONAL CONFERENCE ON INNOVATIVE COMPUTING TECHNOLOGY (INTECH), 2016, : 497 - 501
  • [2] A generic lexicon-based framework for sentiment analysis
    Moussa M.E.
    Mohamed E.H.
    Haggag M.H.
    International Journal of Computers and Applications, 2020, 42 (05) : 463 - 473
  • [3] Lexicon-Based Sentiment Analysis for Movie Review Tweets
    Azizan, Azilawati
    Jamal, Nurul Najwa S. K. Abdul
    Abdullah, Mohammad Nasir
    Mohamad, Masurah
    Khairuddin, Nurkhairizan
    2019 1ST INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND DATA SCIENCES (AIDAS2019), 2019, : 132 - 136
  • [4] Developing Lexicon-based Algorithms and Sentiment Lexicon for Sentiment Analysis of Saudi Dialect Tweets
    Al-Ghaith, Waleed
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2019, 10 (11) : 83 - 88
  • [5] The wisdom of the lexicon crowds: leveraging on decades of lexicon-based sentiment analysis for improved results
    Chelsey H. Hill
    Jorge E. Fresneda
    Murugan Anandarajan
    Journal of Big Data, 12 (1)
  • [6] Arabic Sentiment Analysis: Lexicon-based and Corpus-based
    Abdulla, Nawaf A.
    Ahmed, Nizar A.
    Shehab, Mohammed A.
    Al-Ayyoub, Mahmoud
    2013 IEEE JORDAN CONFERENCE ON APPLIED ELECTRICAL ENGINEERING AND COMPUTING TECHNOLOGIES (AEECT), 2013
  • [7] Lexicon-Based Sentiment Analysis in Behavioral Research
    Ian Cero
    Jiebo Luo
    John Michael Falligant
    Perspectives on Behavior Science, 2024, 47 : 283 - 310
  • [8] Words Are Important: Improving Sentiment Analysis in the Persian Language by Lexicon Refining
    Basiri, Mohammad Ehsan
    Kabiri, Arman
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2018, 17 (04)
  • [9] Lexicon-Based Sentiment Analysis in Behavioral Research
    Cero, Ian
    Luo, Jiebo
    Falligant, John Michael
    PERSPECTIVES ON BEHAVIOR SCIENCE, 2024, 47 (01) : 283 - 310
  • [10] The Lexicon-based Sentiment Analysis for Fan Page Ranking in Facebook
    Ngoc, Phan Trong
    Yoo, Myungsik
    2014 INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING (ICOIN 2014), 2014, : 444 - 448