Twitter Dataset and Evaluation of Transformers for Turkish Sentiment Analysis

Cited by: 12
Authors
Koksal, Abdullatif [1]
Ozgur, Arzucan [1]
Affiliations
[1] Bogazici Univ, Bilgisayar Muhendisligi Bolumu, Istanbul, Turkey
Source
29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021) | 2021
Keywords
sentiment analysis; Turkish dataset; Twitter; BounTi; transformers; BERT
DOI
10.1109/SIU53274.2021.9477814
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Subject classification codes
0808; 0809
Abstract
Sentiment analysis is one of the key topics in Natural Language Processing and supports applications ranging from social media analysis to stock market prediction. Sentiment analysis datasets are generally collected by semi-supervision from shopping or review websites, and are constructed by mapping users' text reviews to the scores those users gave. However, these datasets might contain errors due to the automatic mapping, and they generally lack the characteristic features of social media text such as emojis, slang, and typos. To address these problems, one of the first manually annotated Turkish sentiment analysis datasets from Twitter is proposed. The BounTi dataset contains Turkish tweets about specific universities in Turkey. Furthermore, the performance of multilingual and Turkish transformer models such as MBERT, XLM-Roberta, and BERTurk is analyzed on this dataset. The best proposed model is based on BERTurk and achieves a macro-averaged recall of 0.729 on the test set. Finally, a social media analysis demonstration with the best model is performed on Turkish tweets about a food brand. The BounTi dataset, fine-tuned models, and related scripts are publicly released.
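For readers unfamiliar with the metric named in the abstract, macro-averaged recall is the unweighted mean of the per-class recall values, so each class counts equally regardless of its size. The following is a minimal sketch of that general definition, not the authors' evaluation script; the function name `macro_recall` and the toy labels are illustrative assumptions and are not taken from BounTi.

```python
from collections import defaultdict


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)  # correctly predicted examples per true class
    total = defaultdict(int)    # all examples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    # Recall for class c is correct[c] / total[c]; average these with equal weight.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical toy labels for illustration only (not drawn from the dataset).
y_true = ["pos", "pos", "neg", "neg", "neg", "neu"]
y_pred = ["pos", "neg", "neg", "neg", "pos", "neu"]
score = macro_recall(y_true, y_pred)
print(round(score, 3))
```

In this toy case the per-class recalls are 1/2, 2/3, and 1/1, and the metric is their plain average. Because the average is unweighted, a small class influences the score as much as a large one, which is one reason the metric is often preferred for imbalanced sentiment data.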
Pages: 4