Alignment-based extraction of multiword expressions

Cited by: 15
Authors
Caseli, Helena de Medeiros [2]
Ramisch, Carlos [1]
Volpe Nunes, Maria das Gracas [3]
Villavicencio, Aline [1, 4]
Affiliations
[1] Univ Fed Rio Grande do Sul, Inst Informat, Porto Alegre, RS, Brazil
[2] Univ Fed Sao Carlos, Dept Comp Sci, NILC, BR-13560 Sao Carlos, SP, Brazil
[3] Univ Sao Paulo, ICMC, NILC, Sao Carlos, SP, Brazil
[4] Univ Bath, Dept Comp Sci, Bath BA2 7AY, Avon, England
Funding
São Paulo Research Foundation, Brazil
Keywords
Automatic identification; Word alignment; Machine translation; Terminology; Multiword expressions; Lexical acquisition; Statistical methods; RESOURCES;
DOI
10.1007/s10579-009-9097-9
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Due to idiosyncrasies in their syntax, semantics or frequency, Multiword Expressions (MWEs) have received special attention from the NLP community, as the methods and techniques developed for the treatment of simplex words are not necessarily suitable for them. This is certainly the case for the automatic acquisition of MWEs from corpora. A lot of effort has been directed to the task of automatically identifying them, with considerable success. In this paper, we propose an approach for the identification of MWEs in a multilingual context, as a by-product of a word alignment process, that not only deals with the identification of possible MWE candidates, but also associates some multiword expressions with semantics. The results obtained indicate the feasibility and low costs in terms of tools and resources demanded by this approach, which could, for example, facilitate and speed up lexicographic work.
Pages: 59-77
Number of pages: 19