Detecting emerging topics by exploiting probability burst and association rule mining: A case study of Library and Information Science

Cited by: 4
Authors
Xu, Min [1]
Li, Guangjian [1]
Wang, Xiaodi [1]
Affiliation
[1] Peking Univ, Dept Informat Management, Beijing 100871, Peoples R China
Keywords
Latent Dirichlet Allocation; Emerging topic detection; Probability burst; Association rule mining; Library and Information Science research; RESEARCH FRONTS; LDA; EVOLUTION; CITATION;
DOI
10.22452/mjlis.vol25no1.3
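Chinese Library Classification (CLC) number
G25 [Library science, librarianship]; G35 [Information science, information work];
Subject classification code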
1205 ; 120501 ;
Abstract
The primary reason for detecting emerging topics is to reduce the time researchers spend finding current, relevant topics while staying aware of trends in their field. The volume of published information is growing rapidly, and tracking the development of a research field by reading the literature manually is challenging. This study takes Library and Information Science (LIS) as a case study to present a new method for detecting emerging topics automatically, which could be applied to various types of documents. The method uses a Latent Dirichlet Allocation (LDA) model to generate topics and calculate their probabilities, and discovers emerging topics by detecting probability bursts across consecutive time spans. Association rule mining and lexical similarity computation are used to represent the topics. The method is tested by comparing the emerging topics it detects with those from the LIS data in the baseline paper. The validation demonstrates that the proposed approach is feasible.
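As a rough illustration of the burst-detection step described in the abstract, the sketch below flags a topic when its probability rises sharply from one time span to the next. The abstract does not give the paper's actual burst criterion, so everything here is an assumption made for illustration: the function name `detect_bursts`, the `min_ratio` and `min_prob` thresholds, the ratio-based rule itself, and the topic names and probabilities in the usage note. The per-span topic probabilities are assumed to have been computed beforehand (for example, by an LDA model).

```python
from typing import Dict, List


def detect_bursts(
    topic_probs: Dict[str, List[float]],
    min_ratio: float = 2.0,   # hypothetical threshold, not from the paper
    min_prob: float = 0.05,   # hypothetical threshold, not from the paper
) -> Dict[str, List[int]]:
    """Return, for each topic, the indices of time spans where it bursts.

    ASSUMPTION: a "burst" at span t means the topic's probability is at
    least `min_prob` and at least `min_ratio` times its probability in
    span t - 1. This is an illustrative rule, not the authors' definition.
    """
    bursts: Dict[str, List[int]] = {}
    for topic, probs in topic_probs.items():
        hits = []
        # Compare each span with the one immediately before it.
        for t in range(1, len(probs)):
            prev, curr = probs[t - 1], probs[t]
            if curr >= min_prob and curr >= min_ratio * prev:
                hits.append(t)
        if hits:
            bursts[topic] = hits
    return bursts
```

Called with made-up data such as `{"altmetrics": [0.01, 0.02, 0.08, 0.09], "cataloguing": [0.10, 0.09, 0.08, 0.08]}`, the function flags only the first topic, at span index 2, because that is the only point where the probability both clears the floor and at least doubles.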
Pages: 47-66
Page count: 20