Framework for Improved Sentiment Analysis via Random Minority Oversampling for User Tweet Review Classification

Cited by: 17
Authors
Almuayqil, Saleh Naif [1]
Humayun, Mamoona [1]
Jhanjhi, N. Z. [2]
Almufareh, Maram Fahaad [1]
Javed, Danish [2]
Affiliations
[1] Jouf Univ, Coll Comp & Informat Sci, Dept Informat Syst, Sakakah 72311, Saudi Arabia
[2] Taylors Univ, Sch Comp Sci SCS, Subang Jaya 47500, Malaysia
Keywords
sentiment analysis (SA); sentiment classification; resampling; random minority oversampling; random majority under sampling; deep learning (DL); machine learning (ML); term frequency inverse document frequency (TF-IDF)
DOI
10.3390/electronics11193058
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Social networks such as Twitter have emerged as platforms where people share their ideas and perspectives on various topics and issues with friends and family, and together these posts form a massive knowledge base. Sentiment analysis based on machine learning has been successful in discovering public opinion from this abundantly available data. However, recent studies have pointed out that imbalanced data can have a negative impact on the results. In this paper, we propose a framework for improved sentiment analysis that combines ordered preprocessing steps with resampling of minority classes to improve performance. The performance of the technique can vary depending on the dataset, as its initial focus is on feature selection and feature combination. Multiple machine learning algorithms are used to classify tweets as positive, negative, or neutral. The results show that random minority oversampling can improve performance and mitigate class imbalance.
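The resampling idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `random_minority_oversample`, the toy tweets, and the choice to balance every class up to the size of the largest one are all assumptions made for illustration. The sketch uses only the Python standard library and duplicates randomly chosen minority-class samples until the classes are the same size.

```python
import random
from collections import Counter, defaultdict


def random_minority_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until every
    class has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # The largest class sets the target size for all classes.
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw extra copies with replacement; the majority class draws none.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data, invented for this example: 3 positive, 2 negative, 1 neutral.
tweets = [
    "great phone", "love it", "works well",
    "battery died", "too slow",
    "arrived today",
]
labels = ["positive", "positive", "positive", "negative", "negative", "neutral"]

balanced_tweets, balanced_labels = random_minority_oversample(tweets, labels)
print(Counter(balanced_labels))
```

In practice, oversampling of this kind is usually applied to the training split only, so that duplicated samples do not leak into the evaluation data.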
Pages: 17