Topic Modeling: Perspectives From a Literature Review

Cited by: 32
Authors
Grisales, A. Andres M. [1]
Robledo, Sebastian [1]
Zuluaga, Martha [2]
Affiliations
[1] Univ Catol Luis Amigo, Fac Adm Econ & Accounting Sci, Medellin 050004, Colombia
[2] Univ Nacl Abierta & Distancia UNAD, Dosquebradas 661007, Colombia
Keywords
Natural language processing; Bibliometrics; Databases; Codes; Bibliographies; Data models; Systematics; Machine learning; Literature review; Scientometrics; Topic modeling; Short text; Words; Framework; Network; Science
DOI
10.1109/ACCESS.2022.3232939
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Topic modeling is a Natural Language Processing technique that has gained popularity over the last ten years, with applications in multiple fields of knowledge. However, there is insufficient empirical evidence to show how this field of study has developed over the years, as well as the main models that have been applied in different contexts. The objective of this paper is to analyze the evolution of the topic modeling technique, the main areas in which it has been applied, and the models that are recommended for specific types of data. The methodology applied is based on bibliometric analysis. First, we searched the Web of Science and the Scopus databases. We then used scientometric techniques and a Tree of Science methodology, which allowed us to analyze the search results from the perspectives of classics, structure, and trends. The results show that the USA and China are among the most productive countries in this field and the applications have been mainly in the identification of sub-topics in short texts, such as social networks and blogs. The main conclusion of this work is that topic modeling is a versatile technique that can complement systematic literature reviews and that has been well-received in different academic and research contexts. The results of this study will help researchers and academics to recognize the importance of these techniques for reviewing large volumes of unstructured information, such as research articles, and in general, for systematic literature reviews.
Pages: 4066-4078
Page count: 13