State of the art versus classical clustering for unsupervised word sense disambiguation

Cited by: 8
Authors
Popescu, Marius
Hristea, Florentina
Affiliation
[1] C.P. 010014 Bucharest, Academiei 14, Str.
Keywords
Word sense disambiguation; Unsupervised disambiguation; Bayesian classification; The EM algorithm; WordNet; Spectral clustering; CORPUS;
DOI
10.1007/s10462-010-9193-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper discusses the importance of the clustering method used in unsupervised word sense disambiguation. It illustrates the fact that a powerful clustering technique can make up for a lack of external knowledge of all types. It argues that feature selection does not always improve disambiguation results, especially when using an advanced, state-of-the-art method, here exemplified by spectral clustering. Disambiguation results obtained when using spectral clustering in the case of the main parts of speech (nouns, adjectives, verbs) are compared to those of the classical clustering method given by the Naïve Bayes model. In the case of unsupervised word sense disambiguation with an underlying Naïve Bayes model, feature selection performed in two completely different ways is surveyed. The type of feature selection providing the best results (WordNet-based feature selection) is also used in the case of spectral clustering. The conclusion is that spectral clustering without feature selection (but using its own feature weighting) produces superior disambiguation results in the case of all parts of speech.
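To make the clustering step mentioned in the abstract concrete, the following is a minimal sketch of spectral bipartitioning applied to bag-of-words contexts of an ambiguous word. It is not the authors' implementation. The toy contexts, the cosine affinity, the restriction to two senses, and the names `cosine` and `spectral_bipartition` are all assumptions made for illustration, and the paper's own feature weighting is not reproduced here.

```python
import math
import random
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def spectral_bipartition(contexts, iterations=200, seed=0):
    """Split contexts into two clusters by the sign of the second eigenvector
    of the normalized affinity matrix D^{-1/2} A D^{-1/2}."""
    n = len(contexts)
    bags = [Counter(c) for c in contexts]
    # Affinity matrix: cosine similarity between contexts, zero on the diagonal.
    A = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    degree = [sum(row) for row in A]
    if any(d == 0.0 for d in degree):
        raise ValueError("every context must share a word with another one")
    inv_sqrt = [1.0 / math.sqrt(d) for d in degree]
    # Adding the identity shifts all eigenvalues to be non-negative without
    # changing the eigenvectors, so power iteration finds the largest ones.
    M = [[A[i][j] * inv_sqrt[i] * inv_sqrt[j] + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # The leading eigenvector is known in closed form: sqrt(degree), normalized.
    total = math.sqrt(sum(degree))
    v1 = [math.sqrt(d) / total for d in degree]
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        # Remove the component along v1, then multiply by M and renormalize.
        proj = sum(xi * vi for xi, vi in zip(x, v1))
        x = [xi - proj * vi for xi, vi in zip(x, v1)]
        x = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(xi * xi for xi in x))
        x = [xi / norm for xi in x]
    # Cluster ids are arbitrary: only the grouping is meaningful.
    return [0 if xi >= 0 else 1 for xi in x]


# Hypothetical contexts of the word "bank" (target word already removed).
contexts = [
    ["deposit", "money", "account", "loan"],
    ["loan", "interest", "money", "rate"],
    ["account", "interest", "deposit", "near"],
    ["river", "water", "fishing", "shore"],
    ["fishing", "boat", "river", "water"],
    ["water", "boat", "shore", "near"],
]
labels = spectral_bipartition(contexts)
print(labels)
```

In this toy setting the first three contexts and the last three should fall into different clusters, even though one shared word ("near") links the two groups. No feature selection is applied: every context word contributes to the affinity.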
Pages: 241-264
Page count: 24