TopicView: Visual Analysis of Topic Models and Their Impact on Document Clustering

Cited by: 2
Authors
Crossno, Patricia J. [1]
Wilson, Andrew T. [1]
Shead, Timothy M. [1]
Davis, Warren L. [1]
Dunlavy, Daniel M. [1]
Affiliation
[1] Sandia Natl Labs, Albuquerque, NM 87185 USA
Keywords
Text analysis; visual model analysis; latent semantic analysis; latent Dirichlet allocation; clustering
DOI
10.1142/S0218213013600087
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a new approach for analyzing topic models using visual analytics. We have developed TopicView, an application for visually comparing and exploring multiple models of text corpora, as a prototype for this type of analysis tool. TopicView uses multiple linked views to visually analyze conceptual and topical content, document relationships identified by models, and the impact of models on the results of document clustering. As case studies, we examine models created using two standard approaches: Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). Conceptual content is compared through the combination of (i) a bipartite graph matching LSA concepts with LDA topics based on the cosine similarities of model factors and (ii) a table containing the terms for each LSA concept and LDA topic listed in decreasing order of importance. Document relationships are examined through the combination of (i) side-by-side document similarity graphs, (ii) a table listing the weights for each document's contribution to each concept/topic, and (iii) a full-text reader for documents selected in either of the graphs or the table. The impact of LSA and LDA models on document clustering applications is explored through similar means, using proximities between documents and cluster exemplars for graph layout edge weighting and table entries. We demonstrate the utility of TopicView's visual approach to model assessment by comparing LSA and LDA models of several example corpora.
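The bipartite matching of LSA concepts with LDA topics described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy term-weight vectors, the 0.5 threshold, and the use of the absolute cosine value (chosen here because the sign of an LSA factor is arbitrary) are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def bipartite_edges(lsa_concepts, lda_topics, threshold=0.5):
    """Return (concept_index, topic_index, similarity) triples, strongest first.

    Assumption: the absolute cosine is used so that a concept and a topic
    pointing in opposite directions still count as similar. The threshold
    is a hypothetical parameter, not a value taken from the paper.
    """
    edges = []
    for i, concept in enumerate(lsa_concepts):
        for j, topic in enumerate(lda_topics):
            sim = abs(cosine(concept, topic))
            if sim >= threshold:
                edges.append((i, j, sim))
    return sorted(edges, key=lambda e: e[2], reverse=True)


# Toy term-weight vectors over a hypothetical four-term vocabulary.
lsa_concepts = [[0.9, 0.1, 0.0, 0.0], [0.0, 0.0, -0.7, -0.7]]
lda_topics = [[0.0, 0.0, 0.5, 0.5], [0.8, 0.2, 0.0, 0.0]]

edges = bipartite_edges(lsa_concepts, lda_topics)
for concept_idx, topic_idx, sim in edges:
    print(f"LSA concept {concept_idx} <-> LDA topic {topic_idx}: {sim:.3f}")
```

Each returned triple corresponds to one edge of the bipartite graph, with the similarity available as an edge weight for layout or filtering.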
Pages: 36