Standard Co-training in Multiword Expression Detection

Cited by: 0
Authors
Metin, Senem Kumova [1]
Affiliations
[1] Izmir Univ Econ, Fac Engn, Dept Software Engn, Sakarya Caddesi 156, Izmir, Turkey
Source
INTELLIGENT HUMAN COMPUTER INTERACTION, IHCI 2017 | 2017 / Vol. 10688
Keywords
Multiword expression; Classification; Co-training
DOI
10.1007/978-3-319-72038-8_14
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multiword expressions (MWEs) are units in language in which multiple words combine without an obvious or known reason. Because MWEs make up a substantial share of both written and spoken language, their identification is regarded as an important task in natural language processing. In this paper, treating MWE detection as a binary classification task, we propose to use a semi-supervised learning algorithm, standard co-training [1]. Co-training is a semi-supervised method that employs two classifiers with two different views to label unlabeled data iteratively, in order to enlarge training sets of limited size. In our experiments, linguistic and statistical features that distinguish MWEs from random word combinations are used as the two views. Two different pairs of classifiers are employed under a group of experimental settings. The tests are performed on a Turkish MWE data set of 3946 positive and 4230 negative MWE candidates. The results show that the classifier using the statistical view succeeds in MWE detection when the training set is enlarged by co-training.
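The co-training loop described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation. The nearest-centroid classifier, the confidence score, the `rounds` and `per_round` parameters, and the toy feature values are all assumptions made for illustration; the record does not state which classifiers, features, or selection settings the paper uses. Each of the two classifiers sees only its own view, and in every round the candidates that either classifier labels most confidently are moved into the shared training set.

```python
class CentroidClassifier:
    """Tiny nearest-centroid classifier, a hypothetical stand-in for the paper's classifiers."""

    def fit(self, X, y):
        # One centroid (column-wise mean) per class label.
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_with_confidence(self, x):
        # Euclidean distance to every class centroid.
        dists = {
            label: sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5
            for label, c in self.centroids.items()
        }
        label = min(dists, key=dists.get)
        total = sum(dists.values())
        # Assumed confidence score: the closer to the winning centroid, the higher.
        conf = 1.0 - dists[label] / total if total > 0 else 1.0
        return label, conf


def co_train(view_a, view_b, labels, unlabeled_a, unlabeled_b, rounds=5, per_round=2):
    """Enlarge a small labeled set using two classifiers trained on two views."""
    L_a, L_b, y = list(view_a), list(view_b), list(labels)
    pool = list(range(len(unlabeled_a)))  # indices of still-unlabeled candidates
    clf_a, clf_b = CentroidClassifier(), CentroidClassifier()
    for _ in range(rounds):
        if not pool:
            break
        clf_a.fit(L_a, y)
        clf_b.fit(L_b, y)
        chosen = {}
        # Each classifier nominates its most confident candidates from its own view.
        for clf, U in ((clf_a, unlabeled_a), (clf_b, unlabeled_b)):
            scored = sorted(
                ((clf.predict_with_confidence(U[i]), i) for i in pool),
                key=lambda t: t[0][1],
                reverse=True,
            )
            for (label, _), i in scored[:per_round]:
                chosen.setdefault(i, label)
        # Nominated candidates join the shared training set in both views.
        for i, label in chosen.items():
            L_a.append(unlabeled_a[i])
            L_b.append(unlabeled_b[i])
            y.append(label)
            pool.remove(i)
    clf_a.fit(L_a, y)
    clf_b.fit(L_b, y)
    return clf_a, clf_b, len(y)


# Toy data (invented): view A stands in for linguistic features, view B for statistical ones.
labeled_a = [[1.0, 1.0], [0.9, 0.8], [0.0, 0.1], [0.1, 0.0]]
labeled_b = [[5.0], [4.5], [0.5], [1.0]]
labels = [1, 1, 0, 0]  # 1 = MWE, 0 = not an MWE
unlabeled_a = [[0.95, 0.9], [0.05, 0.05], [0.8, 1.0], [0.2, 0.1], [0.85, 0.85], [0.0, 0.0]]
unlabeled_b = [[4.8], [0.7], [5.2], [0.9], [4.6], [0.4]]

clf_a, clf_b, n_labeled = co_train(labeled_a, labeled_b, labels, unlabeled_a, unlabeled_b)
```

In this sketch both classifiers add to one shared labeled set, so each one is retrained on examples that the other view found easy to label. How many candidates to add per round, and whether to balance positive and negative additions, are design choices that the sketch fixes arbitrarily.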
Pages: 178-188
Number of pages: 11
Related papers
25 in total
[1]  
[Anonymous], 1998, Advances in Kernel Methods-Support Vector Learning
[2]  
Belkin M, 2002, NIPS 2002, P271
[3]  
Blum A., 1998, Proceedings of the Eleventh Annual Conference on Computational Learning Theory, P92, DOI 10.1145/279943.279962
[4]  
He S, 2006, SELF TRAINING COTRAI, P13
[5]  
Kiritchenko S., 2001, Proc. Conference of the Centre for Advanced Studies on Collaborative Research, P301
[6]  
Kumova Metin S, 2017, 25 SIGN PROC COMM AP, P1
[7]  
Metin S. K, 2016, J SOC SCI TURKIC SUM, P253
[8]  
Metin SK, 2016, COMPUT SYST SCI ENG, V31, P209
[9]  
Metin SK, 2010, LECT NOTES ARTIF INT, V6233, P238, DOI 10.1007/978-3-642-14770-8_27
[10]  
Mihalcea R, 2003, LANG LEARN, P182