Word Sense Disambiguation Using Clustered Sense Labels

Cited by: 4
Authors
Park, Jeong Yeon [1]
Shin, Hyeong Jin [1]
Lee, Jae Sung [1]
Affiliation
[1] Chungbuk Natl Univ, Dept Comp Sci, Cheongju 28644, South Korea
Source
APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 04
Funding
National Research Foundation, Singapore
Keywords
word sense disambiguation; clustering; deep learning; sense vocabulary
DOI
10.3390/app12041857
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Sequence labeling models for word sense disambiguation have proven highly effective when the sense vocabulary is compressed based on the thesaurus hierarchy. In this paper, we propose a method for compressing the sense vocabulary without using a thesaurus. To do so, sense definitions in a dictionary are converted into sentence vectors and clustered into compressed senses. First, the very large set of sense vectors is partitioned to reduce computational complexity, and then it is clustered hierarchically with awareness of homographs. Experiments were conducted on the English Senseval and SemEval datasets and the Korean Sejong sense-annotated corpus. The results show that performance increased greatly compared with the uncompressed sense model and is comparable to that of the thesaurus-based model.
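The pipeline described in the abstract (embed definitions, partition, then cluster hierarchically with awareness of homographs) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the toy sense inventory, the bag-of-words `embed` function standing in for a real sentence encoder, partitioning by part of speech, average-link merging with a cosine threshold, and the reading of "awareness of homographs" as "never merge two senses of the same lemma".

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy inventory: (sense_id, lemma, part_of_speech, definition).
SENSES = [
    ("bank%1", "bank", "n", "financial institution that accepts deposits of money"),
    ("bank%2", "bank", "n", "sloping land beside a river"),
    ("shore%1", "shore", "n", "land beside a sea lake or river"),
    ("treasury%1", "treasury", "n", "institution that keeps deposits of public money"),
    ("run%1", "run", "v", "move fast on foot"),
    ("sprint%1", "sprint", "v", "move very fast on foot over a short distance"),
]


def embed(text):
    """Stand-in for a sentence encoder (assumption): bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_partition(items, threshold):
    """Average-link agglomerative clustering over (sense_id, lemma, vector).

    Two clusters are never merged if they share a lemma, so senses of one
    word stay distinguishable after compression (assumed constraint).
    """
    clusters = [[item] for item in items]
    while True:
        best_sim, best_pair = threshold, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                lemmas_i = {lemma for _, lemma, _ in clusters[i]}
                lemmas_j = {lemma for _, lemma, _ in clusters[j]}
                if lemmas_i & lemmas_j:
                    continue  # would put two senses of one lemma together
                total = sum(cosine(a[2], b[2])
                            for a in clusters[i] for b in clusters[j])
                sim = total / (len(clusters[i]) * len(clusters[j]))
                if sim >= best_sim:
                    best_sim, best_pair = sim, (i, j)
        if best_pair is None:
            return clusters
        i, j = best_pair
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)


def compress_senses(senses, threshold=0.5):
    """Map each sense_id to a compressed label such as 'n_0'."""
    partitions = defaultdict(list)
    for sense_id, lemma, pos, definition in senses:
        # Partition first so clustering only compares senses within a group.
        partitions[pos].append((sense_id, lemma, embed(definition)))
    mapping = {}
    for pos, items in partitions.items():
        for k, cluster in enumerate(cluster_partition(items, threshold)):
            for sense_id, _, _ in cluster:
                mapping[sense_id] = f"{pos}_{k}"
    return mapping


if __name__ == "__main__":
    for sense_id, label in compress_senses(SENSES).items():
        print(sense_id, "->", label)
```

On this toy data the two senses of "bank" receive different compressed labels, while each is grouped with a near-synonym from another lemma. A real system would replace `embed` with a trained sentence encoder and tune the partitioning and threshold on data.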
Pages: 11