Opinion mining on large scale data using sentiment analysis and k-means clustering

Cited by: 46
Authors
Riaz, Sumbal [1]
Fatima, Mehvish [1]
Kamran, M. [1]
Nisar, M. Wasif [1]
Affiliation
[1] COMSATS Inst Informat Technol, Dept Comp Sci, Wah Cantt, Pakistan
Source
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2019 / Vol. 22 / Issue Suppl 3
Keywords
Heterogeneous data processing; Imbalanced learning; Intelligent computing; CLASSIFICATION; ALGORITHMS; LEXICON; WORDS;
DOI
10.1007/s10586-017-1077-z
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the rapid growth of web technology and easy access to the internet, online shopping has increased. People now express their opinions and share their experiences, which greatly influence new buyers' purchasing decisions and generate large data sets. This data is very helpful for analyzing customer preferences, needs, and behavior toward a product. Companies face the challenge of analyzing this sheer amount of data to extract customer opinion. To address this challenge, in this paper we perform sentiment analysis at the phrase level on real-world customer review data to find customer preferences by analyzing subjective expressions. We then calculate the strength of each sentiment word to find the intensity of each expression, and apply clustering to place the words in clusters based on their intensity. We also compare the results of our technique with the star ranking given on the same dataset and find a drastic difference in our results. Finally, we provide a visual representation of our results to give clear insight into customer preferences and behavior and to help decision makers make better decisions.
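The pipeline outlined in the abstract (score phrases, derive a sentiment strength, cluster by intensity) can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon values, the intensifier and negation rules, the names `phrase_strength` and `kmeans_1d`, and the choice of three clusters are all assumptions made for illustration only.

```python
# Hypothetical mini-lexicon: word -> polarity strength in [-1, 1] (illustrative values).
LEXICON = {"excellent": 0.9, "good": 0.6, "okay": 0.1, "poor": -0.6, "terrible": -0.9}
# Hypothetical modifiers: intensifiers scale the next sentiment word, negations flip it.
INTENSIFIERS = {"very": 1.5, "slightly": 0.5}
NEGATIONS = {"not", "never"}


def phrase_strength(phrase):
    """Sum lexicon strengths in a phrase, applying any preceding modifier, clamped to [-1, 1]."""
    score, scale, sign = 0.0, 1.0, 1.0
    for token in phrase.lower().split():
        if token in NEGATIONS:
            sign = -1.0
        elif token in INTENSIFIERS:
            scale = INTENSIFIERS[token]
        elif token in LEXICON:
            score += sign * scale * LEXICON[token]
            scale, sign = 1.0, 1.0  # modifiers apply to one sentiment word only
    return max(-1.0, min(1.0, score))


def assign(values, centroids):
    """Label each value with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: abs(v - centroids[j])) for v in values]


def kmeans_1d(values, k, iterations=100):
    """Lloyd's k-means on scalars (k >= 2); centroids start evenly spaced over the value range."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        labels = assign(values, centroids)
        updated = []
        for j in range(k):
            members = [v for v, label in zip(values, labels) if label == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return assign(values, centroids), centroids


phrases = ["very good", "excellent", "not good", "terrible", "okay", "slightly poor"]
strengths = [phrase_strength(p) for p in phrases]
labels, centroids = kmeans_1d(strengths, k=3)
for phrase, strength, label in zip(phrases, strengths, labels):
    print(f"{phrase!r}: strength={strength:+.2f}, cluster={label}")
```

Because the initial centroids are spread from the lowest to the highest strength, cluster indices in this sketch run from most negative to most positive. A real study would likely use an established sentiment lexicon and a library k-means with multiple restarts.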
Pages: S7149 - S7164
Page count: 16
Related papers
50 in total
  • [1] Opinion mining on large scale data using sentiment analysis and k-means clustering
    Sumbal Riaz
    Mehvish Fatima
    M. Kamran
    M. Wasif Nisar
    Cluster Computing, 2019, 22 : 7149 - 7164
  • [2] Large scale K-means clustering using GPUs
    Li, Mi
    Frank, Eibe
    Pfahringer, Bernhard
    DATA MINING AND KNOWLEDGE DISCOVERY, 2023, 37 (01) : 67 - 109
  • [3] Hierarchical K-means Method for Clustering Large-Scale Advanced Metering Infrastructure Data
    Xu, Tian-Shi
    Chiang, Hsiao-Dong
    Liu, Guang-Yi
    Tan, Chin-Woo
    IEEE TRANSACTIONS ON POWER DELIVERY, 2017, 32 (02) : 609 - 616
  • [4] K-Means Clustering With Incomplete Data
    Wang, Siwei
    Li, Miaomiao
    Hu, Ning
    Zhu, En
    Hu, Jingtao
    Liu, Xinwang
    Yin, Jianping
    IEEE ACCESS, 2019, 7 : 69162 - 69171
  • [5] SVM and k-Means Hybrid Method for Textual Data Sentiment Analysis
    Korovkinas, Konstantinas
    Danenas, Paulius
    Garsva, Gintautas
    BALTIC JOURNAL OF MODERN COMPUTING, 2019, 7 (01) : 47 - 60
  • [6] K-means Data Clustering with Memristor Networks
    Jeong, YeonJoo
    Lee, Jihang
    Moon, John
    Shin, Jong Hoon
    Lu, Wei D.
    NANO LETTERS, 2018, 18 (07) : 4447 - 4453
  • [7] An extension of the K-means algorithm to clustering skewed data
    Melnykov, Volodymyr
    Zhu, Xuwen
    COMPUTATIONAL STATISTICS, 2019, 34 (01) : 373 - 394
  • [8] Band depth based initialization of K-means for functional data clustering
    Albert-Smet, Javier
    Torrente, Aurora
    Romo, Juan
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2023, 17 (02) : 463 - 484
  • [9] Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm
    Iparraguirre-Villanueva, Orlando
    Guevara-Ponce, Victor
    Sierra-Linan, Fernando
    Beltozar-Clemente, Saul
    Cabanillas-Carbone, Michael
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (06) : 571 - 578
  • [10] NMR metabolic analysis of samples using fuzzy K-means clustering
    Cuperlovic-Culf, Miroslava
    Belacel, Nabil
    Cuif, Adrian S.
    Chute, Ian C.
    Ouellette, Rodney J.
    Burton, Ian W.
    Karakach, Tobias K.
    Walter, John A.
    MAGNETIC RESONANCE IN CHEMISTRY, 2009, 47 : S96 - S104