Selecting Training Data for Unsupervised Domain Adaptation in Word Sense Disambiguation

Times cited: 0
Authors
Komiya, Kanako [1]
Sasaki, Minoru [1]
Shinnou, Hiroyuki [1]
Kotani, Yoshiyuki [2]
Okumura, Manabu [3]
Affiliations
[1] Ibaraki Univ, 4-12-1 Nakanarusawa, Hitachi, Ibaraki 3168511, Japan
[2] Tokyo Univ Agr & Technol, 2-24-16 Naka Cho, Koganei, Tokyo 1848588, Japan
[3] Tokyo Inst Technol, Midori Ku, 4259 Nagatuta, Yokohama, Kanagawa 2268503, Japan
Source
PRICAI 2016: TRENDS IN ARTIFICIAL INTELLIGENCE | 2016 / Vol. 9810
Keywords
Domain adaptation; Word sense disambiguation; Data selection;
DOI
10.1007/978-3-319-42911-3_18
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes a method of domain adaptation, that is, adapting a classifier trained on source data to target data. We automatically select, from the whole source data of multiple domains, the training data set that is suitable for the target data. The setting is unsupervised domain adaptation for Japanese word sense disambiguation (WSD). Experiments showed that WSD accuracy improved when the training data set was selected automatically using two criteria, the degree of confidence and the leave-one-out (LOO)-bound score, compared with training the classifier on all the data.
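The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how candidate training sets are formed or how the degree of confidence is computed, so the sketch assumes one candidate set per source domain, a toy nearest-centroid classifier, and a softmax over negative squared distances as the confidence. The LOO-bound criterion is not shown. All function names (`train_centroids`, `confidence`, `select_training_set`) and the example data are hypothetical.

```python
import math
from collections import defaultdict


def train_centroids(examples):
    """Toy stand-in classifier (assumption): one centroid per sense label.

    examples: list of (feature_vector, sense_label) pairs.
    """
    sums = {}
    counts = defaultdict(int)
    for x, y in examples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def confidence(centroids, x):
    """Assumed confidence: highest softmax probability over the sense labels,
    using negative squared Euclidean distance to each centroid as the score."""
    scores = [-sum((a - b) ** 2 for a, b in zip(c, x)) for c in centroids.values()]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift by max for stability
    return max(exps) / sum(exps)


def select_training_set(candidates, target_unlabeled):
    """Return the name of the candidate training set whose classifier is,
    on average, most confident on the unlabeled target instances.

    candidates: dict mapping a candidate name to its labeled examples.
    target_unlabeled: list of feature vectors; no target labels are used.
    """
    def mean_confidence(examples):
        model = train_centroids(examples)
        total = sum(confidence(model, x) for x in target_unlabeled)
        return total / len(target_unlabeled)

    return max(candidates, key=lambda name: mean_confidence(candidates[name]))


# Hypothetical data: two source domains, two senses of one target word.
candidates = {
    "news": [([0.0, 0.0], "sense1"), ([1.0, 1.0], "sense2")],
    "blog": [([0.0, 0.0], "sense1"), ([10.0, 10.0], "sense2")],
}
target = [[0.0, 0.0], [10.0, 10.0]]
chosen = select_training_set(candidates, target)
```

Only unlabeled target instances enter the selection step, which is what keeps the procedure unsupervised with respect to the target domain. A real system would replace the toy classifier with whatever model is used for WSD and would combine the confidence with the second criterion named in the abstract.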
Pages: 220-232
Number of pages: 13