Micro-Blog Topic Detection Method Based on BTM Topic Model and K-Means Clustering Algorithm

Cited by: 32
Authors
Li, Weijiang [1]
Feng, Yanming [1]
Li, Dongjun [2]
Yu, Zhengtao [1]
Affiliations
[1] Kunming Univ Sci & Technol, Dept Informat Engn & Automat, Kunming 650500, Peoples R China
[2] Soochow Univ, Jinan Qingqi Peugeot Motorcycle Co Ltd, R&D Dept, Jinan 250104, Shandong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
short text; topic model; topic discovery; K-means clustering;
DOI
10.3103/S0146411616040040
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
The growth of micro-blogging, which generates short texts at large scale, gives people a convenient way to communicate. At the same time, discovering topics in short texts has become a genuinely difficult problem. Traditional topic models, such as probabilistic latent semantic analysis (PLSA) and Latent Dirichlet Allocation (LDA), struggle to model short texts because they suffer from severe data sparsity when processing them. Moreover, the K-means clustering algorithm can make topics discriminative when the dataset is dense and the differences among topic documents are distinct. In this paper, the Biterm Topic Model (BTM) is employed to process short micro-blog texts in order to alleviate the sparsity problem. We further integrate the K-means clustering algorithm into BTM for topic discovery. Experimental results on Sina micro-blog short-text collections demonstrate that our method can discover topics effectively.
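The abstract does not spell out how BTM and K-means are combined, so the following is only a minimal sketch of the two building blocks it names, not the authors' implementation. It assumes that a biterm is an unordered pair of words co-occurring in one short text, and that K-means is run over per-document topic vectors. The posts, the `doc_topic` vectors, and the function names `extract_biterms` and `kmeans` are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return all unordered word pairs (biterms) from one short text."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def kmeans(points, k, iterations=20):
    """Plain Lloyd's K-means; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical tokenized micro-blog posts (illustrative only).
posts = [["earthquake", "rescue", "team"], ["rescue", "team", "arrives"]]
biterm_counts = Counter(b for post in posts for b in extract_biterms(post))

# Hypothetical document-topic vectors, standing in for the output of a
# trained BTM; a real pipeline would infer these from the biterms.
doc_topic = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels, centroids = kmeans(doc_topic, k=2)
```

Counting biterms over the whole corpus rather than per document is what lets BTM sidestep the sparsity of individual short texts; the clustering step then groups documents whose topic vectors lie close together.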
Pages: 271-277
Number of pages: 7
Related papers
50 items in total
  • [41] Detection algorithm of abnormal characteristics of urban domestic water quality based on K-means clustering
    Huang, Xiaoying
    INTERNATIONAL JOURNAL OF ENVIRONMENTAL TECHNOLOGY AND MANAGEMENT, 2023, 26 (3-5) : 226 - 237
  • [42] A Novel Supervised Multi-model Modeling Method Based on k-means Clustering
    Liu, Linlin
    Zhou, Lifang
    Xie, Shenggang
    2010 CHINESE CONTROL AND DECISION CONFERENCE, VOLS 1-5, 2010, : 684 - 689
  • [43] Model Based Modified K-Means Clustering for Microarray Data
    Suresh, R. M.
    Dinakaran, K.
    Valarmathie, P.
    2009 INTERNATIONAL CONFERENCE ON INFORMATION MANAGEMENT AND ENGINEERING, PROCEEDINGS, 2009, : 271 - 273
  • [44] Corn Straw Coverage Calculation Algorithm Based on K-means Clustering and Zoning Optimization Method
    An X.
    Wang P.
    Luo C.
    Meng Z.
    Chen L.
    Zhang A.
    Nongye Jixie Xuebao/Transactions of the Chinese Society for Agricultural Machinery, 2021, 52 (10) : 84 - 89
  • [45] A Fast and Effective Kernel-Based K-Means Clustering Algorithm
    Kong Dexi
    Kong Rui
    2013 THIRD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM DESIGN AND ENGINEERING APPLICATIONS (ISDEA), 2013, : 58 - 61
  • [46] An Improved K-means Clustering Algorithm Based on Meliorated Initial Centre
    Li, Xiang
    Wei, Zhenwei
    Li, Lingling
    PROCEEDINGS OF THE 2016 2ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND INDUSTRIAL ENGINEERING (AIIE 2016), 2016, 133 : 73 - 76
  • [47] Multi group sparrow search algorithm based on K-means clustering
    Yan S.
    Liu W.
    Yang P.
    Wu F.
    Yan Z.
    Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics, 2024, 50 (02) : 508 - 518
  • [48] A passive islanding detection method based on K-means clustering and EMD of reactive power signal
    Thomas, Sindhura Rose
    Kurupath, Venugopalan
    Nair, Usha
    SUSTAINABLE ENERGY GRIDS & NETWORKS, 2020, 23
  • [49] A Word's Difficulty Level Classification Model Based on Random Forest Algorithm and K-Means Clustering Algorithm
    Ning, Jiajie
    Huang, Feifan
    Yin, Maoyuan
    2023 8TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYTICS, ICCCBDA, 2023, : 143 - 146
  • [50] Micro-Blood Vessel Detection Using K-means Clustering and Morphological Thinning
    Luo, Zhongming
    Liu, Zhuofu
    Li, Junfu
    ADVANCES IN NEURAL NETWORKS - ISNN 2011, PT III, 2011, 6677 : 348 - 354