A decade of research in statistics: a topic model approach

Cited by: 0
Authors
Francesca De Battisti
Alfio Ferrara
Silvia Salini
Affiliations
[1] Università degli Studi di Milano, DEMM
[2] Università degli Studi di Milano, DI
Source
Scientometrics | 2015 / Vol. 103
Keywords
Probabilistic topic models; Scientometrics; Clustering; Text mining
DOI
Not available
Abstract
Topic models are a well-known clustering approach for textual data, with promising applications in bibliometrics for discovering scientific topics and trends in a corpus of scientific publications. However, topic models per se provide poorly descriptive metadata for the discovered clusters of publications, and they are not related to the other important metadata usually available with publications, such as author affiliation, publication venue, and publication year. In this paper, we propose a methodological approach to topic modeling and to the post-processing of topic model results, with the aim of describing a field of research in depth over time. In particular, we work on a selection of publications from the international statistical literature, propose an approach that allows us to identify sophisticated topic descriptors, and analyze the links between topics and their temporal evolution.
Pages: 413-433
Number of pages: 20