Sentiment Analysis using Feature Generation And Machine Learning Approach

Cited by: 1
Authors
Srivastava, Roopam [1]
Bharti, P. K. [1]
Verma, Parul [2]
Affiliations
[1] Shri Venkateshwara Univ, Comp Sci Dept, Gajraula, UP, India
[2] Amity Univ, Amity Inst Informat Technol, Lucknow, Uttar Pradesh, India
Source
2021 IEEE INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION, AND INTELLIGENT SYSTEMS (ICCCIS) | 2021
Keywords
Machine Learning; Natural Language Processing; Text Classification;
DOI
10.1109/ICCCIS51004.2021.9397135
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Studying opinions helps identify the most critical problems. Because sentiment analysis can be automated, decisions can be based on a significant amount of data rather than on intuition alone, which is not always accurate. This paper focuses on feature generation using Bag-of-Words (BoW) and TF-IDF and on building models for sentiment analysis with a machine learning approach. The dataset contains TripAdvisor reviews of various hotels and consists of 20k reviews. A word cloud was formed using sentiment ratings. The data was cleaned and pre-processed, and BoW and TF-IDF were then applied for feature extraction. After the classifiers were implemented, training and evaluation were performed. Evaluation metrics were used to measure the accuracy of each classifier. Of the three classifiers used, MultinomialNB obtained the highest accuracy with Bag-of-Words features, and random forest performed best with TF-IDF features. MultinomialNB achieved a classification rate of 82% with Bag-of-Words features, and random forest achieved 78% accuracy with TF-IDF features.
Pages: 86-91
Number of pages: 6
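
The abstract describes generating Bag-of-Words and TF-IDF features and then training a multinomial Naive Bayes classifier. The following is a minimal standard-library sketch of that idea, not the authors' implementation: the toy reviews, the function names, the plain `ln(N / df)` idf variant, and the Laplace smoothing value `alpha=1.0` are all assumptions made for illustration. The random forest classifier mentioned in the abstract is omitted to keep the sketch short.

```python
import math
import re
from collections import Counter

# Hypothetical toy reviews (not from the paper's dataset); 1 = positive, 0 = negative.
TRAIN = [
    ("great hotel friendly staff", 1),
    ("clean room great location", 1),
    ("dirty room rude staff", 0),
    ("terrible service dirty bathroom", 0),
]

def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())

def bag_of_words(docs):
    """Return the sorted vocabulary and one raw term-count vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[w] for w in vocab])
    return vocab, vectors

def tf_idf(vectors):
    """Reweight count vectors by idf = ln(N / df), one common variant among several."""
    n_docs, n_terms = len(vectors), len(vectors[0])
    df = [sum(1 for v in vectors if v[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / df[j]) for j in range(n_terms)]
    return [[v[j] * idf[j] for j in range(n_terms)] for v in vectors]

def train_multinomial_nb(vectors, labels, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log term likelihoods per class."""
    n_terms = len(vectors[0])
    log_prior, log_likelihood = {}, {}
    for c in sorted(set(labels)):
        rows = [v for v, y in zip(vectors, labels) if y == c]
        log_prior[c] = math.log(len(rows) / len(vectors))
        totals = [sum(r[j] for r in rows) + alpha for j in range(n_terms)]
        denom = sum(totals)
        log_likelihood[c] = [math.log(t / denom) for t in totals]
    return log_prior, log_likelihood

def vectorize(text, vocab):
    """Count vocabulary terms in unseen text; out-of-vocabulary words are ignored."""
    counts = Counter(tokenize(text))
    return [counts[w] for w in vocab]

def predict(model, vector):
    """Return the class with the highest posterior log score."""
    log_prior, log_likelihood = model
    scores = {
        c: log_prior[c] + sum(x * ll for x, ll in zip(vector, log_likelihood[c]))
        for c in log_prior
    }
    return max(scores, key=scores.get)

docs = [text for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
vocab, bow = bag_of_words(docs)
tfidf = tf_idf(bow)
model = train_multinomial_nb(bow, labels)
prediction = predict(model, vectorize("friendly staff and clean room", vocab))
```

In this sketch, a term that occurs in fewer documents (such as "hotel") receives a larger TF-IDF weight than one that occurs in more documents (such as "great"). Library implementations, for example scikit-learn's `TfidfVectorizer`, use a smoothed idf by default, so their weights will differ from the ones computed here.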