Geometric Relationship between Word and Context Representations

Cited by: 0
Authors
Feng, Jiangtao [1]
Zheng, Xiaoqing [1]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China
Source
THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2018
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Pre-trained distributed word representations have proven useful in various natural language processing (NLP) tasks. However, the geometric basis of word representations and their relation to the representations of words' contexts have not yet been carefully studied. In this study, we first investigate this geometric relationship under a general framework abstracted from several typical word representation learning approaches, and find that only the directions of word representations are well associated with their context vector representations, while the magnitudes are not. To make better use of the information contained in the magnitudes of word representations, we propose a hierarchical Gaussian model combined with maximum a posteriori estimation to learn word representations, and extend it to represent polysemous words. Our word representations were evaluated on multiple NLP tasks, and the experimental results show that the proposed model achieves promising results compared to several popular word representations.
Pages: 5102-5109
Number of pages: 8