Identification of Nominal Multiword Expressions in Bengali Using CRF

Cited by: 0
Authors
Chakraborty, Tanmoy [1]
Affiliation
[1] Indian Inst Technol, Dept Comp Sci & Engn, Kharagpur 721302, W Bengal, India
Source
4TH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN COMPUTER INTERACTION (IHCI 2012) | 2012
Keywords
Multiword Expressions; Bengali; CRF; Reduplications
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
One of the key issues in both natural language understanding and generation is the appropriate processing of Multiword Expressions (MWEs). MWEs pose a major problem for precise language processing because of their idiosyncratic nature and their diversity in lexical, syntactic and semantic properties. The semantics of an MWE can be expressed transparently or opaquely after combining the semantics of its constituents. This paper deals with the identification of Nominal Multiword Expressions in Bengali text using the Conditional Random Field (CRF) machine learning technique. Bengali is a highly agglutinative and morphologically rich language. Thus features such as surrounding words, POS tag, prefix, suffix and length prove to be very effective when running the CRF tool for the identification of Nominal MWEs. Compared to the statistical system built for compound noun MWE identification in Bengali, our proposed system shows higher accuracy in terms of precision, recall and F-score. We also conclude that identifying Reduplicated MWEs (RMWEs) and using them as a feature yields a reasonable improvement over the earlier system.
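The feature types named in the abstract (surrounding words, POS tag, prefix, suffix, length, and a reduplication cue) can be illustrated with a minimal sketch of per-token feature extraction for a CRF sequence labeller. This is not the paper's implementation: the function names, the affix length of 3, the adjacent-identical-token test for reduplication, and the transliterated example tokens with their POS tags are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, POS tag); the tag set here is hypothetical


def token_features(sent: List[Token], i: int, affix_len: int = 3) -> Dict[str, object]:
    """Build a feature dict for token i of a POS-tagged sentence.

    affix_len is an assumed setting, not a value taken from the paper.
    """
    word, pos = sent[i]
    feats: Dict[str, object] = {
        "word": word,
        "pos": pos,
        "prefix": word[:affix_len],
        "suffix": word[-affix_len:],
        "length": len(word),
        # Simplified reduplication cue: the neighbouring token is identical.
        "redup_next": i + 1 < len(sent) and sent[i + 1][0] == word,
        "redup_prev": i > 0 and sent[i - 1][0] == word,
    }
    # Surrounding-word context, with boundary markers at sentence edges.
    if i > 0:
        feats["prev_word"], feats["prev_pos"] = sent[i - 1]
    else:
        feats["BOS"] = True
    if i + 1 < len(sent):
        feats["next_word"], feats["next_pos"] = sent[i + 1]
    else:
        feats["EOS"] = True
    return feats


def sentence_features(sent: List[Token]) -> List[Dict[str, object]]:
    """Feature dicts for every token in the sentence, in order."""
    return [token_features(sent, i) for i in range(len(sent))]


# Illustrative transliterated tokens; not drawn from the paper's corpus.
example = [("choto", "JJ"), ("choto", "JJ"), ("ghar", "NN")]
features = sentence_features(example)
```

A list of such dicts per sentence is the input shape that many CRF toolkits accept alongside a parallel list of labels; the choice of toolkit and label scheme is left open here because the record does not state them.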
Pages: 6