Topic Mining for Call Centers Based on LDA

Cited by: 0
Authors
Guo, Wenming [1]
Deng, Tianlang [1]
Affiliation
[1] Beijing Univ Posts & Telecommun, Sch Software Engn, Beijing 100876, Peoples R China
Source
2014 10TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION (ICNC) | 2014
Keywords
LDA; topic mining; call-centers; A-LDA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Latent Dirichlet Allocation (LDA), an unsupervised learning method, can be used for topic detection, automatic text categorization, keyword extraction, and similar tasks. It considers only the text itself and ignores external correlation properties. An external correlation property is a structured attribute that corresponds to the text data: for example, a paper usually has properties such as authors and publication time, and a telephone call usually has properties such as caller number and call time. To address this limitation, we propose A-LDA, an improved model based on LDA. We use data sets from telephone call centers (a rapidly growing kind of data center) to experiment on topic detection. The results show that A-LDA, by introducing external correlation properties, achieves lower perplexity than traditional LDA and generalizes better. At the same time, we can obtain the topics associated with the external attributes.
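The abstract compares the two models by perplexity. As a reminder of what that number measures, held-out perplexity is conventionally the exponential of the negative average per-token log-likelihood, so lower values mean the model assigns higher probability to unseen text. The sketch below illustrates only that standard definition; this record does not say how the authors computed it, and the function name and sample values are hypothetical.

```python
import math


def perplexity(doc_log_likelihoods, doc_lengths):
    """Return exp(-(total log-likelihood) / (total token count)).

    doc_log_likelihoods: natural-log likelihood of each held-out document
    doc_lengths: number of tokens in each held-out document
    """
    total_tokens = sum(doc_lengths)
    if total_tokens == 0:
        raise ValueError("held-out set contains no tokens")
    return math.exp(-sum(doc_log_likelihoods) / total_tokens)


# Hypothetical check: a model that assigns every token probability 1/8
# should score a perplexity of 8 (up to floating-point rounding).
uniform_ppl = perplexity([10 * math.log(1 / 8)], [10])

# Hypothetical comparison: a higher held-out log-likelihood over the same
# tokens yields a lower perplexity.
baseline_ppl = perplexity([-70.0, -50.0], [10, 8])
improved_ppl = perplexity([-60.0, -45.0], [10, 8])
```

Here `improved_ppl` comes out below `baseline_ppl`, which is the sense in which a lower perplexity indicates better generalization.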
Pages: 839-844
Page count: 6