Opinion mining on large scale data using sentiment analysis and k-means clustering

Cited by: 46
Authors
Riaz, Sumbal [1]
Fatima, Mehvish [1]
Kamran, M. [1]
Nisar, M. Wasif [1]
Affiliations
[1] COMSATS Inst Informat Technol, Dept Comp Sci, Wah Cantt, Pakistan
Source
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2019, Vol. 22, Suppl. 3
Keywords
Heterogeneous data processing; Imbalanced learning; Intelligent computing; CLASSIFICATION; ALGORITHMS; LEXICON; WORDS;
DOI
10.1007/s10586-017-1077-z
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
With the rapid growth of web technology and easy access to the internet, online shopping has increased. People now express their opinions and share their experiences online, which greatly influences new buyers' purchasing decisions and generates large data sets. This data is very helpful for analyzing customer preferences, needs, and behavior toward a product. Companies face the challenge of analyzing this sheer volume of data to extract customer opinion. To address this challenge, in this paper we perform sentiment analysis at the phrase level on real-world customer review data to find customer preferences by analyzing subjective expressions. We then calculate the strength of each sentiment word to find the intensity of each expression, and apply clustering to place the words in clusters based on their intensity. We also compare the results of our technique with the star ranking given on the same dataset and find that our results differ drastically from it. Finally, we provide a visual representation of our results to give clear insight into customer preference and behavior and to support better decision making.
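The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the lexicon, the scoring formula, or the number of clusters, so everything here is an assumption for illustration: the `SCORES` values are hypothetical intensities in [-1, 1], `k=3` is an arbitrary choice, and `kmeans_1d` and `assign` are hypothetical helper names implementing plain Lloyd's k-means on scalar scores, not the authors' implementation.

```python
import random

# Hypothetical phrase-level intensity scores in [-1, 1]; not taken from the paper.
SCORES = {
    "terrible": -0.9, "poor": -0.6, "not good": -0.5,
    "okay": 0.1, "fine": 0.2,
    "good": 0.6, "very good": 0.8, "excellent": 0.95,
}

def kmeans_1d(values, k, iters=100, seed=0):
    """Plain Lloyd's k-means on scalar values; returns the centroids sorted ascending."""
    rng = random.Random(seed)
    # Start from k distinct observed values.
    centroids = rng.sample(sorted(set(values)), k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        # Assignment step: each value joins its nearest centroid.
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Update step: each centroid moves to the mean of its group
        # (an empty group keeps its previous centroid).
        updated = [sum(g) / len(g) if g else centroids[i]
                   for i, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)

def assign(scores, centroids):
    """Map each phrase to the index of its nearest centroid."""
    clusters = {i: [] for i in range(len(centroids))}
    for phrase, s in scores.items():
        idx = min(range(len(centroids)), key=lambda i: abs(s - centroids[i]))
        clusters[idx].append(phrase)
    return clusters

centroids = kmeans_1d(list(SCORES.values()), k=3)
clusters = assign(SCORES, centroids)
for idx, phrases in clusters.items():
    print(idx, round(centroids[idx], 2), phrases)
```

Because the centroids are returned in ascending order, cluster 0 holds the most negative expressions and the last cluster the most positive ones. The exact grouping depends on the initial centroids, so a different `seed` or `k` may give a different split.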
Pages: S7149-S7164
Page count: 16
Related papers
50 in total
  • [21] Sentiment Score Analysis for Opinion Mining
    Singh, Nidhi
    Sharma, Nonita
    Juneja, Akanksha
    MACHINE INTELLIGENCE AND SIGNAL ANALYSIS, 2019, 748 : 363 - 374
  • [22] Enhanced bisecting k-means clustering using intermediate cooperation
    Kashef, R.
    Kamel, M. S.
    PATTERN RECOGNITION, 2009, 42 (11) : 2557 - 2569
  • [23] Acute Leukemia Classification by Using SVM and K-Means Clustering
    Laosai, Jakkrich
    Chamnongthai, Kosin
    2014 INTERNATIONAL ELECTRICAL ENGINEERING CONGRESS (IEECON), 2014,
  • [24] Distance Analysis Measuring for Clustering using K-Means and Davies Bouldin Index Algorithm
    Idrus, Ali
    Tarihoran, Nafan
    Supriatna, Ucup
    Tohir, Ahmad
    Suwarni, Suwarni
    Rahim, Robbi
    TEM JOURNAL-TECHNOLOGY EDUCATION MANAGEMENT INFORMATICS, 2022, 11 (04) : 1871 - 1876
  • [25] A survey on classification techniques for opinion mining and sentiment analysis
    Hemmatian, Fatemeh
    Sohrabi, Mohammad Karim
    ARTIFICIAL INTELLIGENCE REVIEW, 2019, 52 (03) : 1495 - 1545
  • [26] Robust deep k-means: An effective and simple method for data clustering
    Huang, Shudong
    Kang, Zhao
    Xu, Zenglin
    Liu, Quanhui
    PATTERN RECOGNITION, 2021, 117
  • [27] Iterated Watersheds, A Connected Variation of K-Means for Clustering GIS Data
    Soor, Sampriti
    Challa, Aditya
    Danda, Sravan
    Sagar, B. S. Daya
    Najman, Laurent
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2021, 9 (02) : 626 - 636
  • [28] A sample-based hierarchical adaptive K-means clustering method for large-scale video retrieval
    Liao, Kaiyang
    Liu, Guizhong
    Xiao, Li
    Liu, Chaoteng
    KNOWLEDGE-BASED SYSTEMS, 2013, 49 : 123 - 133
  • [29] Mining netizen's opinion on cryptocurrency: sentiment analysis of Twitter data
    Hassan, M. Kabir
    Hudaefi, Fahmi Ali
    Caraka, Rezzy Eko
    STUDIES IN ECONOMICS AND FINANCE, 2022, 39 (03) : 365 - 385
  • [30] Comparison and Detection Analysis of Network Traffic Datasets Using K-Means Clustering Algorithm
    Al-Sanjary, Omar Ismael
    Bin Roslan, Muhammad Aiman
    Helmi, Rabab Alayham Abbas
    Ahmed, Ahmed Abdullah
    JOURNAL OF INFORMATION & KNOWLEDGE MANAGEMENT, 2020, 19 (03)