Opinion mining on large scale data using sentiment analysis and k-means clustering

Cited by: 46
Authors
Riaz, Sumbal [1]
Fatima, Mehvish [1]
Kamran, M. [1]
Nisar, M. Wasif [1]
Affiliations
[1] COMSATS Inst Informat Technol, Dept Comp Sci, Wah Cantt, Pakistan
Source
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2019 / Vol. 22 / Suppl. 3
Keywords
Heterogeneous data processing; Imbalanced learning; Intelligent computing; CLASSIFICATION; ALGORITHMS; LEXICON; WORDS;
DOI
10.1007/s10586-017-1077-z
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the rapid growth of web technology and easy access to the internet, online shopping has increased. People now express their opinions and share their experiences, which greatly influence new buyers' purchasing decisions and generate large data sets. This data is very helpful for analyzing customer preferences, needs, and behavior toward a product. Companies face the challenge of analyzing this sheer volume of data to extract customer opinion. To address this challenge, in this paper we perform phrase-level sentiment analysis on real-world customer review data to find customer preferences by analyzing subjective expressions. We then calculate the strength of each sentiment word to determine the intensity of each expression, and apply clustering to place the words in clusters based on their intensity. We also compare the results of our technique with the star rankings given on the same dataset and find that our results differ drastically. Finally, we provide a visual representation of our results to give clear insight into customer preference and behavior and to help decision makers make better decisions.
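The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sentiment strengths in `SCORES`, the choice of three clusters, and the helper names `kmeans_1d` and `assign` are all assumptions made for illustration, and the abstract does not say how strengths are computed or how many clusters are used.

```python
from typing import Dict, List

# Hypothetical sentiment strengths in [-1, 1]; illustrative values, not from the paper.
SCORES: Dict[str, float] = {
    "excellent": 0.9, "great": 0.8, "good": 0.5,
    "okay": 0.1, "average": 0.0,
    "poor": -0.5, "bad": -0.6, "terrible": -0.9,
}


def kmeans_1d(values: List[float], k: int, iters: int = 100) -> List[float]:
    """Run Lloyd's k-means on scalar values and return the sorted centroids."""
    ordered = sorted(values)
    # Deterministic initialization: pick k values spread across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        # Assignment step: each value joins the group of its nearest centroid.
        groups: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            groups[nearest].append(v)
        # Update step: move each centroid to its group mean (keep it if the group is empty).
        updated = [sum(g) / len(g) if g else centroids[j] for j, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def assign(scores: Dict[str, float], centroids: List[float]) -> Dict[str, int]:
    """Map each word to the index of its nearest centroid (0 = most negative)."""
    return {
        word: min(range(len(centroids)), key=lambda j: abs(s - centroids[j]))
        for word, s in scores.items()
    }


centroids = kmeans_1d(list(SCORES.values()), k=3)
clusters = assign(SCORES, centroids)
```

Because the centroids are returned in ascending order, cluster indices can be read as intensity bands, from most negative to most positive.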
Pages: S7149-S7164
Page count: 16