A reliable sentiment analysis for classification of tweets in social networks

Cited by: 6

Authors:

AminiMotlagh, Masoud [1]

Shahhoseini, HadiShahriar [1]

Fatehi, Nina [2]

Affiliations:

[1] Iran Univ Sci & Technol, Sch Elect Engn, Tehran, Iran

[2] Wayne State Univ, Dept Elect & Comp Engn, Detroit, MI USA

Source:

SOCIAL NETWORK ANALYSIS AND MINING | 2022 / Vol. 13 / Issue 01

Keywords:

Social networks analysis; Sentiment analysis; Data mining; Text mining; Twitter

DOI:

10.1007/s13278-022-00998-2

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

In modern society, social networks are used more than ever and have become the most popular medium for daily communication. Twitter is a social network where users share their daily emotions and opinions through tweets. Sentiment analysis is a method for identifying these emotions and determining whether a text is positive, negative, or neutral. In this article, we apply four widely used data mining classifiers, namely K-nearest neighbor, decision tree, support vector machine, and naive Bayes, to analyze the sentiment of tweets. The analysis is performed on two datasets: first a two-class dataset (positive and negative), then a three-class dataset (positive, negative, and neutral). Furthermore, we use two ensemble methods to decrease the variance and bias of the learning algorithms and thereby increase their reliability. We also divide the dataset into a training set and a testing set with different percentages of data to identify the best train-test split ratio. Our results show that the support vector machine outperforms the other algorithms, improving accuracy by 3.53% on the two-class dataset and by 7.41% on the three-class dataset. The experiments show that single classifiers slightly outperform ensemble methods in accuracy; however, the ensemble methods yield more reliable learning models. The results also show that using 50% of the dataset as training data gives almost the same results as using 70%, while tenfold cross-validation can reach better results.
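The evaluation protocols named in the abstract (a holdout split with a chosen training percentage, and tenfold cross-validation) can be illustrated with a minimal, standard-library sketch. This is not the authors' code: the function names `holdout_split` and `kfold_splits`, the shuffling seed, and the sample count of 100 are assumptions made purely for illustration, and the sketch only produces index splits without training any classifier.

```python
import random


def holdout_split(n_samples, train_fraction, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into train and test lists.

    Hypothetical helper: train_fraction=0.5 or 0.7 mirrors the 50% and 70%
    training ratios mentioned in the abstract.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(n_samples * train_fraction)
    return indices[:cut], indices[cut:]


def kfold_splits(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Each sample lands in exactly one test fold; the other k-1 folds
    form the training set for that round.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for fraction in (0.5, 0.7):
        train, test = holdout_split(100, fraction)
        print(f"holdout {fraction:.0%}: {len(train)} train, {len(test)} test")
    for round_no, (train, test) in enumerate(kfold_splits(100, k=10), start=1):
        print(f"fold {round_no}: {len(train)} train, {len(test)} test")
```

In a holdout split each sample is used either for training or for testing, whereas in tenfold cross-validation every sample is tested exactly once and used for training in the other nine rounds.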

Pages: 11

