Graph-based Word Sense Disambiguation of biomedical documents

Cited by: 36
Authors
Agirre, Eneko [1]
Soroa, Aitor [1]
Stevenson, Mark [1]
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
DOI
10.1093/bioinformatics/btq555
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Word Sense Disambiguation (WSD), the task of automatically identifying the meaning of ambiguous words in context, is an important stage of text processing. This article presents a graph-based approach to WSD in the biomedical domain. The method is unsupervised and does not require any labeled training data. It makes use of knowledge from the Unified Medical Language System (UMLS) Metathesaurus, which is represented as a graph. A state-of-the-art algorithm, Personalized PageRank, is used to perform WSD.
Results: When evaluated on the NLM-WSD dataset, the algorithm outperforms other methods that rely on the UMLS Metathesaurus alone.
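The idea summarized in the abstract can be illustrated with a minimal sketch of Personalized PageRank used for sense selection. This is not the authors' implementation: the toy graph, the concept names (`common_cold`, `cold_temperature`, and so on), the function names, and the damping value of 0.85 are all hypothetical choices made for illustration, and a real system would build the graph from UMLS Metathesaurus relations. The sketch assumes that teleport probability is placed only on the concepts found in the context, and that the candidate sense with the highest resulting rank is selected.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power iteration on an undirected graph; teleport mass goes only to seeds."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Keep only seeds that actually occur in the graph.
    seeds = [s for s in seeds if s in neighbors]
    if not seeds:
        raise ValueError("no seed concept occurs in the graph")

    nodes = sorted(neighbors)
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)

    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbor m spreads its rank evenly over its own edges.
            incoming = sum(rank[m] / len(neighbors[m]) for m in neighbors[n])
            new_rank[n] = (1 - damping) * teleport[n] + damping * incoming
        rank = new_rank
    return rank


def disambiguate(edges, context_concepts, candidate_senses):
    """Pick the candidate sense ranked highest when the context is the seed set."""
    rank = personalized_pagerank(edges, context_concepts)
    return max(candidate_senses, key=lambda s: rank.get(s, 0.0))


# Hypothetical toy graph: two senses of the ambiguous word "cold".
EDGES = [
    ("common_cold", "virus"),
    ("common_cold", "cough"),
    ("common_cold", "respiratory_infection"),
    ("virus", "respiratory_infection"),
    ("cold_temperature", "weather"),
    ("cold_temperature", "ice"),
    ("weather", "ice"),
    ("weather", "respiratory_infection"),
]

if __name__ == "__main__":
    sense = disambiguate(EDGES, ["virus", "cough"], ["common_cold", "cold_temperature"])
    print(sense)
```

In this toy setting the context concepts `virus` and `cough` are both direct neighbors of `common_cold`, so rank concentrates around that sense. Because no labeled examples are involved, the sketch reflects the unsupervised character described in the abstract, but the choice of seeds and graph here is an assumption.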
Pages: 2889-2896
Page count: 8