Attention-based Unsupervised Keyphrase Extraction and Phrase Graph for COVID-19 Medical Literature Retrieval

Cited by: 0
Authors
Ding, Haoran [1]
Luo, Xiao [1]
Affiliations
[1] Indiana Univ, Purdue Univ Indianapolis, 799 W Michigan St, Indianapolis, IN 46202 USA
Source
ACM TRANSACTIONS ON COMPUTING FOR HEALTHCARE | 2021 / Vol. 3 / Issue 01
Keywords
Keyphrase extraction; deep learning; medical information retrieval; COVID-19; INFORMATION;
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Searching, reading, and finding information in massive medical text collections is challenging. It is not feasible for a typical biomedical search engine to navigate each article to find critical information or keyphrases, and few tools visualize the phrases relevant to a query. There is therefore a need to extract keyphrases from each document for indexing and efficient search. The transformer-based neural network BERT has been used for various natural language processing tasks, and its built-in self-attention mechanism can capture the associations between words and phrases in a sentence. This research investigates whether self-attention can be used to extract keyphrases from a document in an unsupervised manner and to identify the relevancy between phrases, in order to construct a query relevancy phrase graph that visualizes the phrases in the search corpus by their relevancy and importance. A comparison with six baseline methods shows that the self-attention-based unsupervised keyphrase extraction works well on a medical literature dataset. The unsupervised keyphrase extraction model can also be applied to other text data. The query relevancy graph model is applied to the COVID-19 literature dataset to demonstrate that the attention-based phrase graph can successfully identify the medical phrases relevant to the query terms.
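The general idea in the abstract can be illustrated with a toy example. This is a minimal sketch and not the authors' implementation: the attention matrix is hand-written rather than taken from BERT, the candidate phrase spans are given rather than extracted, and the scoring rules (mean attention received from outside tokens, and mean bidirectional attention between two phrases) are assumptions chosen for illustration. The names `phrase_scores` and `phrase_relevance` are hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # [start, end) token indices of a candidate phrase

# Hypothetical sentence and a hand-written attention matrix (each row sums to 1).
# attn[i][j] is the attention that token i pays to token j.
tokens = ["covid", "vaccine", "reduces", "severe", "illness"]
attn: List[List[float]] = [
    [0.40, 0.30, 0.10, 0.10, 0.10],
    [0.30, 0.40, 0.10, 0.10, 0.10],
    [0.30, 0.30, 0.10, 0.15, 0.15],
    [0.20, 0.20, 0.10, 0.25, 0.25],
    [0.20, 0.20, 0.10, 0.25, 0.25],
]

# Candidate phrases, assumed to come from an earlier chunking step.
spans: Dict[str, Span] = {
    "covid vaccine": (0, 2),
    "reduces": (2, 3),
    "severe illness": (3, 5),
}


def phrase_scores(attn: List[List[float]], spans: Dict[str, Span]) -> Dict[str, float]:
    """Score each phrase by the mean attention its tokens receive from tokens outside it."""
    n = len(attn)
    scores = {}
    for phrase, (start, end) in spans.items():
        outside = [i for i in range(n) if not start <= i < end]
        received = sum(attn[i][j] for i in outside for j in range(start, end))
        scores[phrase] = received / (len(outside) * (end - start))
    return scores


def phrase_relevance(attn: List[List[float]], a: Span, b: Span) -> float:
    """Symmetric relevance: mean attention between the tokens of a and b, both directions."""
    pairs = [(i, j) for i in range(*a) for j in range(*b)]
    total = sum(attn[i][j] + attn[j][i] for i, j in pairs)
    return total / (2 * len(pairs))


scores = phrase_scores(attn, spans)
ranked = sorted(scores, key=scores.get, reverse=True)  # candidate keyphrases, best first

# One weighted edge of a phrase graph: relevance between two candidate phrases.
edge_weight = phrase_relevance(attn, spans["covid vaccine"], spans["severe illness"])
print(ranked, round(edge_weight, 3))
```

In a real setting the matrix would come from a transformer's attention outputs, and the choice of layer, head aggregation, and candidate-phrase extraction would all affect the ranking.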
Pages: 16