How Hierarchical Topics Evolve in Large Text Corpora

Cited by: 91
Authors
Cui, Weiwei
Liu, Shixia
Wu, Zhuofeng [1]
Wei, Hao [2]
Affiliations
[1] Nankai Univ, Tianjin, Peoples R China
[2] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China
Keywords
Hierarchical topic visualization; evolutionary tree clustering; data transformation; COLLECTIONS; DESIGN
DOI
10.1109/TVCG.2014.2346433
Chinese Library Classification
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Using a sequence of topic trees to organize documents is a popular way to represent hierarchical and evolving topics in text corpora. However, following evolving topics in the context of topic trees remains difficult for users. To address this issue, we present an interactive visual text analysis approach that allows users to progressively explore and analyze the complex evolutionary patterns of hierarchical topics. The key idea behind our approach is to exploit a tree cut to approximate each tree and allow users to interactively modify the tree cuts based on their interests. In particular, we propose an incremental evolutionary tree cut algorithm with the goal of balancing 1) the fitness of each tree cut and the smoothness between adjacent tree cuts, and 2) the historical and new information related to user interests. A time-based visualization is designed to illustrate the evolving topics over time. To preserve the mental map, we develop a stable layout algorithm. As a result, our approach can quickly guide users to progressively gain profound insights into evolving hierarchical topics. We evaluate the effectiveness of the proposed method on Amazon's Mechanical Turk and real-world news data. The results show that users are able to successfully analyze evolving topics in text data.
Pages: 2281-2290
Page count: 10