A decade of research in statistics: a topic model approach

Cited by: 57
Authors
De Battisti, Francesca [1]
Ferrara, Alfio [2]
Salini, Silvia [1]
Affiliations
[1] Univ Milan, DEMM, Milan, Italy
[2] Univ Milan, DI, Milan, Italy
Keywords
Probabilistic topic models; Scientometrics; Clustering; Text mining; Research output; Probability; Publications; Index
DOI
10.1007/s11192-015-1554-1
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Topic models are a well-known clustering approach for textual data, with promising applications in bibliometrics for discovering scientific topics and trends in a corpus of scientific publications. However, topic models per se provide poorly descriptive metadata for the discovered clusters of publications, and they are not related to other important metadata usually available with publications, such as author affiliation, publication venue, and publication year. In this paper, we propose a methodological approach to topic modeling and to the post-processing of topic model results, with the aim of describing a field of research in depth over time. In particular, we work on a selection of publications from the international statistical literature, propose an approach that allows us to identify sophisticated topic descriptors, and analyze the links between topics and their temporal evolution.
Pages: 413-433
Number of pages: 21