Unsupervised word sense disambiguation and rules extraction using non-aligned bilingual corpus

Citations: 0
Authors
Oliveira, F [1]
Wong, F [1]
Li, YP [1]
Zheng, J [1]
Affiliation
[1] Univ Macau, Fac Sci & Technol, Macao, Peoples R China
Source
Proceedings of the 2005 IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE'05) | 2005
Keywords
word sense disambiguation; natural language processing; machine translation;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a statistical Word Sense Disambiguation method with application in Portuguese-Chinese Machine Translation systems. Because Portuguese-Chinese resources in the form of digital corpora and annotated treebanks are scarce, an unsupervised learning approach and a non-aligned bilingual corpus are used. The proposed method first identifies the words related to each ambiguous word, based on its surrounding words and their relative distance. A mathematical model is then applied to identify the most suitable sense of an ambiguous word in terms of its related words. All the senses discovered are converted into a set of rules and stored in the Sense Knowledge base for later use in the disambiguation and translation process. Preliminary experimental results show a 6% improvement over the baseline method in correctly assigning the corresponding translation.
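The first two steps described in the abstract (collecting related words by surrounding context and relative distance, then choosing a sense from those related words) can be illustrated with a minimal sketch. The abstract does not give the paper's mathematical model, so everything below is an assumption made for illustration: the function names `related_words` and `pick_sense`, the 1/distance weighting, the `window` parameter, and the overlap score are all hypothetical and are not the authors' method.

```python
from collections import defaultdict


def related_words(sentences, target, window=3):
    """Score words that occur near `target`.

    Assumption: a word at distance d from the target contributes 1/d,
    so closer words count more. The paper's real weighting is not
    given in the abstract.
    """
    scores = defaultdict(float)
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    scores[tokens[j]] += 1.0 / abs(i - j)
    return dict(scores)


def pick_sense(context, sense_profiles):
    """Return the sense whose related-word profile best matches `context`.

    Assumption: `sense_profiles` maps each candidate sense to a
    {related_word: weight} dict; the score is a plain weighted overlap.
    """
    def overlap(profile):
        return sum(profile.get(word, 0.0) for word in context)

    return max(sense_profiles, key=lambda sense: overlap(sense_profiles[sense]))


# Toy Portuguese sentences containing the ambiguous word "banco"
# (accents dropped for simplicity; data invented for illustration).
sentences = [
    ["o", "banco", "aprovou", "o", "emprestimo"],
    ["sentou", "no", "banco", "do", "jardim"],
]
scores = related_words(sentences, "banco", window=2)

# Hypothetical sense profiles, keyed by an English gloss of each sense.
profiles = {
    "bank": {"emprestimo": 1.0, "dinheiro": 1.0},
    "bench": {"jardim": 1.0, "sentou": 1.0},
}
chosen = pick_sense(["sentou", "no", "jardim"], profiles)
```

In this toy setup the related-word scores play the role of the evidence from which sense rules would be derived, and `pick_sense` stands in for looking up such rules at translation time. A real system would build the profiles from the non-aligned bilingual corpus rather than by hand.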
Pages: 30-35
Page count: 6