Interactive Topic Modeling for aiding Qualitative Content Analysis

Cited by: 15
Authors
Bakharia, Aneesha [1]
Bruza, Peter [1]
Watters, Jim [1]
Narayan, Bhuva [2]
Sitbon, Laurianne [1]
Affiliations
[1] Queensland Univ Technol, Brisbane, Qld, Australia
[2] Univ Technol Sydney, Sydney, NSW, Australia
Source
PROCEEDINGS OF THE 2016 ACM CONFERENCE ON HUMAN INFORMATION INTERACTION AND RETRIEVAL (CHIIR'16) | 2016
Keywords
Topic Modeling; Content Analysis; Latent Dirichlet Allocation; Non-negative Matrix Factorisation
DOI
10.1145/2854946.2854960
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Topic Modeling algorithms are rarely used to support the qualitative content analysis process. The lack of mainstream adoption can be attributed to the perception that Topic Modeling produces topics of poor quality, and to content analysts not trusting the derived topics because they are unable to supply domain knowledge and interact with the algorithm. In this paper, two interactive Topic Modeling algorithms, namely Dirichlet-Forest Latent Dirichlet Allocation and Penalised Non-negative Matrix Factorisation, are evaluated with respect to their ability to aid qualitative content analysis. More specifically, the relationship between interactivity, interpretation, topic coherence and trust in interactive content analysis is examined. The findings indicate that providing content analysts with the ability to interact with Topic Modeling algorithms produces topics that are directly related to their research questions. However, a number of improvements to these algorithms were also identified; these have the potential to influence future algorithm development so that it better meets the requirements of qualitative content analysts.
Pages: 213-222
Page count: 10