Self-organizing semantic maps and its application to word alignment in Japanese-Chinese parallel corpora

Cited by: 1
Authors
Ma, Q [1]
Kanzaki, K
Zhang, YJ
Murata, M
Isahara, H
Affiliations
[1] Ryukoku Univ, Fac Sci & Technol, Dept Appl Math & Informat, Seta, Otsu 5202194, Japan
[2] Natl Inst Informat & Commun Technol, Keihanna Human Info Commun Res Ctr, Kyoto 6190289, Japan
Keywords
semantic map; word alignment; corpus; parallel corpus; Japanese; Chinese; monolingual; bilingual; SOM
DOI
10.1016/j.neunet.2004.07.011
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a method for self-organizing monolingual semantic maps: visible, continuous representations in which Chinese or Japanese words with similar meanings are placed at the same or neighboring points, so that the distance between them represents their semantic similarity. We used the self-organizing map (SOM) as the self-organizing device. The words to be self-organized are defined by sets of co-occurring words collected from Chinese or Japanese newspapers, according to their grammatical relationships. The words are then coded into vectors to be forwarded to the SOM, taking into account the semantic correlation between them, which is established using a form of word-similarity computation. The self-organized monolingual semantic maps are assessed by numerical evaluations of accuracy, recall, and the F-measure, as well as by intuition and by comparisons with a clustering method and with multivariate statistical analysis. This paper further discusses the possibility that the proposed method can be extended to constructing Japanese-Chinese bilingual semantic maps, with the aim of providing a semantics-based approach to word alignment in Japanese-Chinese parallel corpora. We also show the effectiveness of this extended method through small-scale comparative experiments with a baseline method, in which the alignment of Japanese and Chinese words is determined directly from the Euclidean distance between the vectors representing the words, with a clustering method, and with multivariate statistical analysis. (C) 2004 Elsevier Ltd. All rights reserved.
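The baseline mentioned in the abstract, aligning words directly by the Euclidean distance between their vectors, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the romanized toy words, and the three-dimensional vectors are all hypothetical, and the sketch assumes that Japanese and Chinese words have already been coded into vectors in a shared space.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def align_by_distance(ja_vectors, zh_vectors):
    """Map each Japanese word to the Chinese word with the nearest vector.

    Both arguments are dicts of word -> vector. The vectors are assumed
    (hypothetically) to live in the same feature space.
    """
    alignment = {}
    for ja_word, ja_vec in ja_vectors.items():
        alignment[ja_word] = min(
            zh_vectors,
            key=lambda zh_word: euclidean(ja_vec, zh_vectors[zh_word]),
        )
    return alignment


# Toy, made-up vectors purely for illustration (not data from the paper).
ja_vectors = {"inu": [1.0, 0.0, 0.2], "mizu": [0.0, 1.0, 0.1]}
zh_vectors = {"gou": [0.9, 0.1, 0.2], "shui": [0.1, 0.9, 0.0]}

result = align_by_distance(ja_vectors, zh_vectors)
print(result)
```

Each Japanese word is aligned independently here, so two Japanese words may map to the same Chinese word; a one-to-one constraint would need an extra assignment step.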
Pages: 1241-1253
Number of pages: 13