Constraint Based Description of Polish Multi-word Expressions

Cited by: 0
Authors
Kurc, Roman [1]
Piasecki, Maciej [1]
Broda, Bartosz [1]
Affiliation
[1] Wroclaw Univ Technol, Inst Informat, PL-50370 Wroclaw, Poland
Source
LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2012
Keywords
multi-word expression representation; multi-word expression recognition; morphosyntactic constraints; WCCL; plWordNet; Polish;
DOI
Not available
Chinese Library Classification number
H0 [Linguistics];
Subject classification codes
030303 ; 0501 ; 050102 ;
Abstract
We present an approach to describing Polish Multi-word Expressions (MWEs) that is based on expressions in WCCL, a language of morpho-syntactic constraints, rather than on grammar rules or transducers. For each MWE, its basic morphological form and the base forms of its constituents are specified, and each MWE is also assigned to a class on the basis of its syntactic structure. For each class, a WCCL constraint is defined that is parametrised by string variables referring to the base forms or inflected forms of the MWE constituents. The constraint specifies a minimal set of conditions that must be fulfilled in order to recognise an occurrence of the given MWE in text with high accuracy. Our formalism focuses on the efficient description of large MWE lexicons for use in text processing. It allows for the relatively easy representation of flexible word order and discontinuous constructions. Moreover, the grammatical structure of an MWE need not be fully specified: only those aspects of a particular MWE structure that facilitate the target recognition accuracy need to be selected. On the basis of a set of simple heuristics, a WCCL-based representation of MWEs can be generated automatically from a list of MWE base forms. The proposed representation was applied on a practical scale to describe a large set of Polish MWEs included in plWordNet.
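The idea of one constraint per structural class, parametrised by constituent base forms, can be illustrated with a minimal Python sketch. This is not WCCL and not the authors' implementation: the `Token` fields, the `adj_noun_class` helper, the `max_gap` window, and the example expression "czarna dziura" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Token:
    """Hypothetical morphosyntactically tagged token (not a WCCL type)."""
    orth: str          # surface (inflected) form
    base: str          # base form (lemma)
    pos: str           # coarse part of speech
    case: str = ""
    number: str = ""
    gender: str = ""


Matcher = Callable[[List[Token]], List[Tuple[int, int]]]


def agrees(a: Token, b: Token) -> bool:
    """Case, number and gender agreement between two tokens."""
    return (a.case, a.number, a.gender) == (b.case, b.number, b.gender)


def adj_noun_class(adj_base: str, noun_base: str, max_gap: int = 2) -> Matcher:
    """Assumed structural class: an adjective and a noun that agree.

    The class is parametrised only by the two base forms. Word order is
    free, and up to `max_gap` tokens may separate the constituents, so
    discontinuous occurrences are also found.
    """
    def match(sentence: List[Token]) -> List[Tuple[int, int]]:
        hits = []
        for i, noun in enumerate(sentence):
            if noun.pos != "subst" or noun.base != noun_base:
                continue
            # Look for the adjective in a window on both sides of the noun.
            lo = max(0, i - max_gap - 1)
            hi = min(len(sentence), i + max_gap + 2)
            for j in range(lo, hi):
                adj = sentence[j]
                if (j != i and adj.pos == "adj"
                        and adj.base == adj_base and agrees(noun, adj)):
                    hits.append((min(i, j), max(i, j)))
        return hits
    return match


# One lexicon entry: only the base forms are supplied; the conditions
# come from the class. The expression is an illustrative example.
czarna_dziura = adj_noun_class("czarny", "dziura")

sentence = [
    Token("Astronomowie", "astronom", "subst", "nom", "pl", "m1"),
    Token("odkryli", "odkryć", "verb"),
    Token("czarną", "czarny", "adj", "acc", "sg", "f"),
    Token("dziurę", "dziura", "subst", "acc", "sg", "f"),
]
print(czarna_dziura(sentence))
```

The matcher checks only base forms, part of speech, and agreement, which mirrors the point that a minimal set of conditions can suffice without a full grammatical description of the expression.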
Pages: 2408-2413
Page count: 6