A method of representing a multi-dimensional space for word concepts

Cited by: 0
Authors
Kasahara, Kaname [1]
Inago, Nozomu [1]
Kato, Tsuneaki [2]
Affiliations
[1] NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation
[2] University of Tokyo, Graduate School of Arts and Sciences
Keywords
Data acquisition - Glossaries - Information management - Information retrieval - Semantics - Thesauri
DOI
10.1527/tjsai.17.539
Abstract
There have been several previous studies on measuring the semantic similarity between words whose concepts are represented as points in a multi-dimensional vector space acquired from text data such as electronic dictionaries or text corpora. A central problem in these studies is how to select orthonormal basis vectors for the space which represents attributes of the words. We propose a method of building the space by combining two representative methods, one using singular value decomposition and the other using the contents of a thesaurus. The proposed method was evaluated both for the purposes of similar word retrieval and for document retrieval. The evaluations showed that the proposed combination is more effective than either of the original methods alone for both of these tasks.
Pages: 539-547