Demystifying COVID-19 publications: institutions, journals, concepts, and topics

Cited by: 3
Authors
Chen, Haihua [1]
Chen, Jiangping [1]
Nguyen, Huyen [1]
Affiliations
[1] Univ North Texas, Dept Informat Sci, Denton, TX 76203 USA
Keywords
COVID-19; pandemic; CORD-19; dataset; global research roadmap; data analytics; topic modeling; EVOLUTION;
DOI
10.5195/jmla.2021.1141
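Chinese Library Classification (CLC) number
G25 [Library science, librarianship]; G35 [Information science, information work];
Discipline classification code
1205; 120501;
Abstract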
Objective: We analyzed the COVID-19 Open Research Dataset (CORD-19) to understand leading research institutions, collaborations among institutions, major publication venues, key research concepts, and topics covered by pandemic related research. Methods: We conducted a descriptive analysis of authors' institutions and relationships, automatic content extraction of key words and phrases from titles and abstracts, and topic modeling and evolution. Data visualization techniques were applied to present the results of the analysis. Results: We found that leading research institutions on COVID-19 included the Chinese Academy of Sciences, the US National Institutes of Health, and the University of California. Research studies mostly involved collaboration among different institutions at national and international levels. In addition to bioRxiv, major publication venues included journals such as The BMJ, PLOS One, Journal of Virology, and The Lancet. Key research concepts included the coronavirus, acute respiratory impairments, health care, and social distancing. The ten most popular topics were identified through topic modeling and included human metapneumovirus and livestock, clinical outcomes of severe patients, and risk factors for higher mortality rate. Conclusion: Data analytics is a powerful approach for quickly processing and understanding large-scale datasets like CORD-19. This approach could help medical librarians, researchers, and the public understand important characteristics of COVID-19 research and could be applied to the analysis of other large datasets.
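The abstract mentions automatic extraction of key words and phrases from titles and abstracts. The record does not say how this was implemented, so the following is only a minimal sketch of the general idea, using plain document-frequency counting over short phrases rather than the authors' actual method. The stopword list, the function names `candidate_phrases` and `top_phrases`, and the sample titles are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def candidate_phrases(text, max_len=2):
    """Yield phrases of 1..max_len words that contain no stopword."""
    tokens = re.findall(r"[a-z0-9][a-z0-9-]*", text.lower())
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if not any(tok in STOPWORDS for tok in gram):
                yield " ".join(gram)


def top_phrases(texts, k=3, max_len=2):
    """Rank phrases by the number of texts they appear in (document frequency)."""
    counts = Counter()
    for text in texts:
        # set() so each phrase is counted at most once per text
        counts.update(set(candidate_phrases(text, max_len)))
    return counts.most_common(k)


# Made-up titles for illustration only; they are not taken from CORD-19.
titles = [
    "Social distancing and the spread of coronavirus",
    "Clinical outcomes of coronavirus patients",
    "Social distancing in health care settings",
]

for phrase, doc_freq in top_phrases(titles, k=5):
    print(phrase, doc_freq)
```

Counting each phrase once per text keeps a single long abstract from dominating the ranking. A fuller pipeline would likely add a proper stopword list and a scoring scheme that goes beyond raw frequency.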
Pages: 395-405
Page count: 11