Alignment-based extraction of multiword expressions

Cited by: 15
Authors
Caseli, Helena de Medeiros [2]
Ramisch, Carlos [1]
Volpe Nunes, Maria das Gracas [3]
Villavicencio, Aline [1, 4]
Affiliations
[1] Univ Fed Rio Grande do Sul, Inst Informat, Porto Alegre, RS, Brazil
[2] Univ Fed Sao Carlos, Dept Comp Sci, NILC, BR-13560 Sao Carlos, SP, Brazil
[3] Univ Sao Paulo, ICMC, NILC, Sao Carlos, SP, Brazil
[4] Univ Bath, Dept Comp Sci, Bath BA2 7AY, Avon, England
Funding
São Paulo Research Foundation (FAPESP), Brazil
Keywords
Automatic identification; Word alignment; Machine translation; Terminology; Multiword expressions; Lexical acquisition; Statistical methods; Resources
DOI
10.1007/s10579-009-9097-9
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Due to idiosyncrasies in their syntax, semantics or frequency, Multiword Expressions (MWEs) have received special attention from the NLP community, as the methods and techniques developed for the treatment of simplex words are not necessarily suitable for them. This is certainly the case for the automatic acquisition of MWEs from corpora. A lot of effort has been directed to the task of automatically identifying them, with considerable success. In this paper, we propose an approach for the identification of MWEs in a multilingual context, as a by-product of a word alignment process, that not only deals with the identification of possible MWE candidates, but also associates some multiword expressions with semantics. The results obtained indicate the feasibility and low costs in terms of tools and resources demanded by this approach, which could, for example, facilitate and speed up lexicographic work.
Pages: 59-77
Number of pages: 19