Construction of a Japanese Word Similarity Dataset

Cited by: 0
Authors
Sakaizawa, Yuya [1]
Komachi, Mamoru [1]
Affiliations
[1] Tokyo Metropolitan Univ, 6-6 Asahigaoka, Hino, Tokyo 1910065, Japan
Source
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) | 2018
Keywords
word embeddings; distributed representation; word similarity
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
Distributed word representations are generally evaluated using a word similarity task, a word analogy task, or both. Many datasets for these tasks are readily available in English. However, evaluating distributed representations is difficult in languages that lack such resources (e.g., Japanese). Therefore, as a first step toward evaluating distributed representations in Japanese, we constructed a Japanese word similarity dataset. To the best of our knowledge, our dataset is the first resource that can be used to evaluate distributed representations in Japanese. Moreover, our dataset contains various parts of speech and includes rare words in addition to common words.
Pages: 948-951
Page count: 4