Exploring the Space of Topic Coherence Measures

Cited by: 1098
Authors
Roeder, Michael [1, 2]
Both, Andreas [2]
Hinneburg, Alexander [3]
Affiliations
[1] Univ Leipzig, Leipzig, Germany
[2] Unister GmbH, R&D, Leipzig, Germany
[3] Martin Luther Univ Halle Wittenberg, Halle, Germany
Source
WSDM'15: PROCEEDINGS OF THE EIGHTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING | 2015
Keywords
topic evaluation; topic coherence; topic model
DOI
10.1145/2684822.2685324
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Quantifying the coherence of a set of statements is a long-standing problem with many potential applications that has attracted researchers from different sciences. The special case of measuring the coherence of topics has recently been studied to remedy the problem that topic models give no guarantee of the interpretability of their output. Several benchmark datasets were produced that record human judgements of the interpretability of topics. We are the first to propose a framework that allows existing word-based coherence measures, as well as new ones, to be constructed by combining elementary components. We conduct a systematic search of the space of coherence measures, using all publicly available topic relevance data for the evaluation. Our results show that new combinations of components outperform existing measures with respect to correlation with human ratings. Finally, we outline how our results can be transferred to further applications in the context of text mining, information retrieval, and the World Wide Web.
Pages: 399-408
Page count: 10
Related papers
17 in total
  • [1] Aletras N, 2013, P 10 INT C COMP SEM, DOI 10.1145/2537052
  • [2] AlSumait L, 2009, LECT NOTES ARTIF INT, V5781, P67, DOI 10.1007/978-3-642-04180-8_22
  • [3] [Anonymous], 2009, P BIENN GSCL C
  • [4] Latent Dirichlet allocation
    Blei, DM
    Ng, AY
    Jordan, MI
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (4-5) : 993 - 1022
  • [5] Bovens L., 2003, Bayesian epistemology
  • [6] Chang J., 2009, Adv. Neural Inf. Process. Syst., V22, DOI 10.5555/2984093.2984126
  • [7] Measuring coherence
    Douven, Igor
    Meijs, Wouter
    [J]. SYNTHESE, 2007, 156 (03) : 405 - 425
  • [8] A probabilistic theory of coherence
    Fitelson, B
    [J]. ANALYSIS, 2003, 63 (03) : 194 - 199
  • [9] Hinneburg Alexander, 2012, Machine Learning and Knowledge Discovery in Databases. Proceedings of the European Conference (ECML PKDD 2012), P838, DOI 10.1007/978-3-642-33486-3_59
  • [10] Lau J. H., 2014, P EUR CHAP ASS COMP