Translation Is Not Enough: Comparing Lexicon-based Methods for Sentiment Analysis in Persian

Cited by: 0
Authors
Basiri, Mohammad Ehsan [1]
Kabiri, Arman [1]
Affiliation
[1] Shahrekord Univ, Dept Comp Engn, Shahrekord, Iran
Source
2017 18TH CSI INTERNATIONAL SYMPOSIUM ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING CONFERENCE (CSSE) | 2017
Keywords
component; Sentiment Analysis; Natural Language Processing; Persian Language; Lexicon-based approach; Opinion mining; Data Mining; SOCIAL MEDIA;
DOI
Not available
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Sentiment analysis is a subfield of data mining and natural language processing that aims to extract people's opinions and appraisals from their comments on the Web. In contrast to machine learning approaches, lexicon-based methods have important advantages, such as domain independence and no need for a large annotated training corpus, and hence are faster. This makes the lexicon-based approach prevalent in the sentiment analysis community. However, for the Persian language, in contrast to English, the lexicon-based approach is a new area. Few lexicons are available for sentiment analysis in Persian, and almost all of them are directly translated from English. In the current study, four lexicons are compared to show the importance of the lexicon to the performance of document-level sentiment analysis. Specifically, the Persian version of the NRC lexicon, SentiStrength, CNRC, and Adjectives are compared in a pure lexicon-based scenario. Experiments are carried out on the document-level edition of the SPerSent dataset. Results show that the direct translation used in NRC leads to the poorest performance, while the pre-processing and refinement of lexicons used in SentiStrength and CNRC improve performance. The results also show that using only adjectives yields better results than using NRC.
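As a rough illustration of what a pure lexicon-based, document-level classifier can look like, the sketch below sums word polarities from a lexicon and labels the document by the sign of the total. It is not the authors' method. The three-word lexicon, its scores, and the function names `tokenize`, `document_score`, and `classify` are all hypothetical and are not taken from NRC, SentiStrength, CNRC, Adjectives, or SPerSent.

```python
import re
from typing import Dict, List

# Hypothetical toy lexicon (word -> polarity); scores are made up for illustration.
TOY_LEXICON: Dict[str, int] = {
    "خوب": 1,   # "good"
    "عالی": 2,  # "excellent"
    "بد": -1,   # "bad"
}


def tokenize(text: str) -> List[str]:
    """Split text into runs of Unicode word characters (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text)


def document_score(text: str, lexicon: Dict[str, int]) -> int:
    """Sum the polarity of every token found in the lexicon; unknown tokens count as 0."""
    return sum(lexicon.get(token, 0) for token in tokenize(text))


def classify(text: str, lexicon: Dict[str, int]) -> str:
    """Label a document by the sign of its summed polarity."""
    score = document_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

In a setup like this the classifier has no trained parameters, so swapping the lexicon is the only thing that changes its behaviour, which is why lexicon quality matters so much in a comparison of this kind.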
Pages: 36-41
Number of pages: 6
Related papers
50 in total
  • [21] Effective lexicon-based approach for Urdu sentiment analysis
    Mukhtar, Neelam
    Khan, Mohammad Abid
    ARTIFICIAL INTELLIGENCE REVIEW, 2020, 53 (04) : 2521 - 2548
  • [22] Bias-Aware Lexicon-Based Sentiment Analysis
    Iqbal, Mohsin
    Karim, Asim
    Kamiran, Faisal
    30TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, VOLS I AND II, 2015, : 845 - 850
  • [23] A Multilingual Lexicon-based Approach for Sentiment Analysis in Social and Cultural Information System Data
    Jardim, Sandra
    Mora, Carlos
    Santana, Tiago
    PROCEEDINGS OF 2021 16TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI'2021), 2021,
  • [24] Identifying the Overlap between Election Result and Candidates' Ranking based on Hashtag-Enhanced, Lexicon-Based Sentiment Analysis
    Rezapour, Rezvaneh
    Wang, Lufan
    Abdar, Omid
    Diesner, Jana
    2017 11TH IEEE INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC), 2017, : 93 - 96
  • [25] An Italian Lexicon-based Sentiment Analysis approach for medical applications
    Martinis, Maria Chiara
    Zucco, Chiara
    Cannataro, Mario
    13TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND HEALTH INFORMATICS, BCB 2022, 2022,
  • [26] Lexicon-based sentiment analysis in texts using Formal Concept Analysis
    Ojeda-Hernandez, Manuel
    Lopez-Rodriguez, Domingo
    Mora, Angel
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2023, 155 : 104 - 112
  • [27] Fast and Accurate - Improving Lexicon-Based Sentiment Classification with an Ensemble Methods
    Augustyniak, Lukasz
    Szymanski, Piotr
    Kajdanowicz, Tomasz
    Kazienko, Przemyslaw
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2016, PT II, 2016, 9622 : 108 - 116
  • [28] Access to credit and fintech: A lexicon-based sentiment analysis application on Twitter data
    Bredice, Marilena
    Formisano, Anna Vittoria
    Kullafi, Sara
    Palma, Pasquale
    RESEARCH IN INTERNATIONAL BUSINESS AND FINANCE, 2025, 77
  • [29] Lexicon-Based Text Analysis for Twitter and Quora
    Nishant, Potnuru Sai
    Mohan, Bhaskaruni Gopesh Krishna
    Chandra, Balina Surya
    Lokesh, Yangalasetty
    Devaraju, Gantakora
    Revanth, Madamala
    INNOVATIVE DATA COMMUNICATION TECHNOLOGIES AND APPLICATION, 2020, 46 : 276 - 283
  • [30] An enhanced lexicon-based approach for sentiment analysis: a case study on illegal immigration
    Mehmood, Yasir
    Balakrishnan, Vimala
    ONLINE INFORMATION REVIEW, 2020, 44 (05) : 1097 - 1117