A novel sampling-based visual topic models with computational intelligence for big social health data clustering

Cited by: 5
Authors
Narasimhulu, K. [1]
Abarna, K. T. Meena [1]
Kumar, B. Siva [1, 2]
Suresh, T. [1]
Affiliations
[1] Annamalai Univ, Chidambaram, Tamil Nadu, India
[2] Rajeev Gandhi Mem Coll Engn & Technol, Dept CSE, Nandyal, Andhra Pradesh, India
Keywords
Big social data; Health cluster tendency; Visual techniques; Topic models; Tweet data; ALGORITHMS;
DOI
10.1007/s11227-021-04300-7
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Twitter is a popular social network where people share views and opinions on various topics. Many people search for health topics through Twitter; thus, a vast amount of social health data can be obtained from it. Topic models are widely used for clustering social health-care data. These models require prior knowledge of the clustering tendency. Determining the number of clusters in given social health data is known as the health cluster tendency. Visual techniques, including visual assessment of cluster tendency (VAT), cosine-based VAT, and multiviewpoint-based cosine similarity features VAT (MVCS-VAT), are used to identify social health cluster tendencies. The recent MVCS-VAT technique is superior to the others; however, it is the most expensive technique for big social health data cluster assessment. Thus, this paper aims to enhance MVCS-VAT with a sampling technique to address the big social health data assessment problem. Experiments are conducted on different health datasets to demonstrate the efficiency of the proposed work. The accuracy of social health data clustering improves by 5 to 10% with the proposed S-MVCS-VAT compared to MVCS-VAT. The results also show that the proposed S-MVCS-VAT is faster and more memory-efficient for discovering social health data clusters.
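To illustrate the general idea of assessing cluster tendency on a sample instead of the full dataset, the following is a minimal Python sketch. It is not the authors' S-MVCS-VAT implementation. It assumes plain cosine dissimilarity (not the multiviewpoint-based features), uniform random sampling (the abstract does not specify the sampling scheme), and a standard VAT-style reordering; the function names `cosine_dissimilarity`, `vat_order`, and `sampled_vat` are hypothetical.

```python
import numpy as np


def cosine_dissimilarity(X):
    """Pairwise cosine dissimilarity (1 - cosine similarity) between rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0           # avoid division by zero for empty vectors
    U = X / norms
    D = np.clip(1.0 - U @ U.T, 0.0, 2.0)
    np.fill_diagonal(D, 0.0)
    return D


def vat_order(D):
    """VAT-style ordering: start at one end of the most dissimilar pair, then
    repeatedly append the unselected point closest to the selected set."""
    n = D.shape[0]
    first = int(np.unravel_index(np.argmax(D), D.shape)[0])
    order = [first]
    remaining = np.ones(n, dtype=bool)
    remaining[first] = False
    min_dist = D[first].copy()        # distance of each point to the selected set
    for _ in range(n - 1):
        candidates = np.where(remaining, min_dist, np.inf)
        nxt = int(np.argmin(candidates))
        order.append(nxt)
        remaining[nxt] = False
        min_dist = np.minimum(min_dist, D[nxt])
    return np.array(order)


def sampled_vat(X, sample_size, seed=0):
    """Assumption: uniform random sampling. Returns the sampled row indices in
    VAT order and the reordered dissimilarity matrix of the sample."""
    rng = np.random.default_rng(seed)
    size = min(sample_size, X.shape[0])
    idx = rng.choice(X.shape[0], size=size, replace=False)
    D = cosine_dissimilarity(X[idx])
    order = vat_order(D)
    return idx[order], D[np.ix_(order, order)]
```

When the reordered matrix is displayed as a grayscale image, dark square blocks along the diagonal suggest candidate clusters. Because the dissimilarity matrix is built only for the sample, memory grows with the square of the sample size rather than the square of the full dataset size.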
Pages: 9619-9641
Number of pages: 23