Semantic Concept Spaces: Guided Topic Model Refinement using Word-Embedding Projections

Cited by: 28
Authors
El-Assady, Mennatallah [1, 2]
Kehlbeck, Rebecca [1]
Collins, Christopher [2]
Keim, Daniel [1]
Deussen, Oliver [1]
Affiliations
[1] Univ Konstanz, Constance, Germany
[2] Ontario Tech Univ, Oshawa, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Topic Model Optimization; Word Embedding; Mixed-Initiative Refinement; Guided Visual Analytics; Semantic Mapping; Visual Analytics
DOI
10.1109/TVCG.2019.2934654
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
We present a framework that allows users to incorporate the semantics of their domain knowledge for topic model refinement while remaining model-agnostic. Our approach enables users to (1) understand the semantic space of the model, (2) identify regions of potential conflicts and problems, and (3) readjust the semantic relation of concepts based on their understanding, directly influencing the topic modeling. These tasks are supported by an interactive visual analytics workspace that uses word-embedding projections to define concept regions which can then be refined. The user-refined concepts are independent of a particular document collection and can be transferred to related corpora. All user interactions within the concept space directly affect the semantic relations of the underlying vector space model, which, in turn, change the topic modeling. In addition to direct manipulation, our system guides the users' decision-making process through recommended interactions that point out potential improvements. This targeted refinement aims at minimizing the feedback required for an efficient human-in-the-loop process. We confirm the improvements achieved through our approach in two user studies that show topic model quality improvements through our visual knowledge externalization and learning process.
Pages: 1001-1011
Number of pages: 11