Distributed Sentiment Analysis for Geo-Tagged Twitter Data

Cited by: 0
Authors
Zengin, Muhammed Said [1]
Arslan, Rabia [1]
Akgun, Mehmet Burak [1]
Affiliation
[1] TOBB Ekon & Teknol Univ, Bilgisayar Muhendisligi Bolumu, Ankara, Turkey
Source
2022 30TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU | 2022
Keywords
Big data; distributed data processing; sentiment analysis; BERT
DOI
10.1109/SIU55565.2022.9864702
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The ever-increasing frequency of sharing on social media makes these platforms one of the primary sources of data for computational social science studies. Similarly, examining and analyzing large-scale social media datasets is crucial for governments as well as companies. However, as the amount of data increases, deriving insights from it with artificial-intelligence-based models becomes increasingly demanding in terms of processing power. Hardware requirements can increase dramatically if the insights are needed under real-time or near-real-time constraints. In this study, we developed a distributed sentiment analysis model that utilizes a large social media dataset. A total of 16 million tweets were collected and grouped by originating city. The sentiment analysis model was produced by fine-tuning the pre-trained BERT model. The distributed big data analytics engine Apache Spark is used to execute the trained model in a distributed fashion. For evaluation purposes, the prediction time on a single compute unit is compared with the distributed prediction time. The sentiment analysis model was executed separately for each of the data groups corresponding to the 81 provinces. The dataset of 16 million tweets used in this study, the Turkish sentiment analysis model produced, the distributed prediction code developed for Apache Spark, and all results of the study are available at https://distributed-sentiment-analysis.github.io/.
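The workflow the abstract describes (group tweets by province, then run the same classifier on each group either on one compute unit or in parallel) can be illustrated with a minimal sketch. This is not the authors' code: the keyword rule in `predict_sentiment` is a hypothetical stand-in for the fine-tuned BERT model, a thread pool stands in for Apache Spark executors, and all function names and sample tweets are assumptions made for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the fine-tuned BERT classifier: a keyword rule.
POSITIVE = {"harika", "güzel"}
NEGATIVE = {"berbat", "kötü"}


def predict_sentiment(text):
    """Label one tweet as positive, negative, or neutral."""
    words = set(text.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def group_by_province(tweets):
    """Group (province, text) pairs into {province: [text, ...]}."""
    groups = defaultdict(list)
    for province, text in tweets:
        groups[province].append(text)
    return dict(groups)


def predict_group(item):
    """Run the classifier over every tweet of one province."""
    province, texts = item
    return province, [predict_sentiment(t) for t in texts]


def predict_single(groups):
    """Baseline: process all provinces sequentially on one compute unit."""
    return dict(predict_group(item) for item in groups.items())


def predict_distributed(groups, workers=4):
    """Process provinces in parallel; the pool stands in for Spark executors."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(predict_group, groups.items()))


# Illustrative sample data, not taken from the published dataset.
tweets = [
    ("Ankara", "hava harika"),
    ("Izmir", "trafik berbat"),
    ("Ankara", "bugün toplantı var"),
]
groups = group_by_province(tweets)
single = predict_single(groups)
distributed = predict_distributed(groups)
```

Both paths should produce identical labels; only the wall-clock time differs. Wrapping each call with `time.perf_counter()` would give a timing comparison of the kind the abstract mentions.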
Pages: 4