Context-aware positional representation for self-attention networks

Cited by: 5
Authors
Chen, Kehai [1 ]
Wang, Rui [1 ]
Utiyama, Masao [1 ]
Sumita, Eiichiro [1 ]
Institutions
[1] Natl Inst Informat & Commun Technol, Kyoto, Japan
Keywords
Positional representation; Context information; Self-attention networks; Machine translation
DOI
10.1016/j.neucom.2021.04.055
Chinese Library Classification
TP18 (theory of artificial intelligence)
Discipline codes
081104; 0812; 0835; 1405
Abstract
In self-attention networks (SANs), positional embeddings are used to model order dependencies between the words of an input sentence; they are added to the word embeddings to form the input representation, which enables a SAN-based neural model to perform parallel (multi-head) and stacked (multi-layer) self-attentive functions to learn the representation of the input sentence. However, this input representation captures only static order dependencies based on the discrete position indexes of words; that is, it is independent of context information, which may weaken the modeling of the input sentence. To address this issue, we propose a novel positional representation method that models order dependencies based on the n-gram context or the sentence context of the input sentence, allowing SANs to learn a more effective sentence representation. To validate the effectiveness of the proposed method, we apply it to neural machine translation, a typical application of SAN-based neural models. Experimental results on two widely used translation tasks, i.e., WMT14 English-to-German and WMT17 Chinese-to-English, show that the proposed approach significantly improves translation performance over a strong Transformer baseline. (c) 2021 Elsevier B.V. All rights reserved.
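The abstract contrasts static, index-based positional embeddings with positions conditioned on surrounding words. The NumPy sketch below illustrates that contrast: the first function is the standard sinusoidal positional encoding of the original Transformer, while `context_aware_positions` is only a minimal illustration of the idea, in which each position's encoding is gated by the mean of its n-gram window. The gating form, window mean, and weight matrix `w` are assumptions for demonstration purposes, not the formulation used in the paper.

```python
# Static vs. context-modulated positional representations (illustrative sketch).
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    """Standard Transformer positional encodings (sin/cos over position indexes)."""
    pos = np.arange(seq_len)[:, None]            # (seq_len, 1)
    i = np.arange(d_model)[None, :]              # (1, d_model)
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    enc = np.zeros((seq_len, d_model))
    enc[:, 0::2] = np.sin(angles[:, 0::2])
    enc[:, 1::2] = np.cos(angles[:, 1::2])
    return enc

def context_aware_positions(word_emb, pos_enc, n=3, rng=None):
    """Hypothetical context-aware variant: gate each position's encoding with a
    summary of its n-gram context, so the same position index yields different
    representations in different sentences (illustration only)."""
    if rng is None:
        rng = np.random.default_rng(0)
    seq_len, d_model = word_emb.shape
    w = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
    half = n // 2
    padded = np.pad(word_emb, ((half, half), (0, 0)))
    # Mean of the surrounding n-gram window is the local context vector.
    context = np.stack([padded[t:t + n].mean(axis=0) for t in range(seq_len)])
    gate = 1.0 / (1.0 + np.exp(-context @ w))    # sigmoid gate in (0, 1)
    return gate * pos_enc                        # context-modulated positions

# Usage: add positional information to word embeddings, as in SAN inputs.
rng = np.random.default_rng(0)
words = rng.standard_normal((6, 8))              # 6 tokens, d_model = 8
static_input = words + sinusoidal_positions(6, 8)
context_input = words + context_aware_positions(words, sinusoidal_positions(6, 8))
print(static_input.shape, context_input.shape)   # (6, 8) (6, 8)
```

In the static case every sentence of the same length receives identical positional vectors; in the gated case the positional component also depends on the neighbouring words, which is the kind of context dependence the paper argues for.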
Pages: 46-56
Number of pages: 11
Related papers
50 items in total
  • [21] Convolutional Self-Attention Networks
    Yang, Baosong
    Wang, Longyue
    Wong, Derek F.
    Chao, Lidia S.
    Tu, Zhaopeng
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 4040 - 4045
  • [22] Context-Aware Multi-View Attention Networks for Emotion Cause Extraction
    Xiao, Xinglin
    Wei, Penghui
    Mao, Wenji
    Wang, Lei
    2019 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENCE AND SECURITY INFORMATICS (ISI), 2019, : 128 - 133
  • [23] Context-aware attention network for image recognition
    Leng, Jiaxu
    Liu, Ying
    Chen, Shang
    Neural Computing and Applications, 2019, 31 : 9295 - 9305
  • [24] Context-aware attention network for image recognition
    Leng, Jiaxu
    Liu, Ying
    Chen, Shang
    NEURAL COMPUTING & APPLICATIONS, 2019, 31 (12): : 9295 - 9305
  • [25] A formal representation for context-aware business processes
    Mattos, Talita da Cunha
    Santoro, Flavia Maria
    Revoredo, Kate
    Nunes, Vanessa Tavares
    COMPUTERS IN INDUSTRY, 2014, 65 (08) : 1193 - 1214
  • [26] Multiple Positional Self-Attention Network for Text Classification
    Dai, Biyun
    Li, Jinlong
    Xu, Ruoyi
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 7610 - 7617
  • [27] Self-supervised Context-aware Style Representation for Expressive Speech Synthesis
    Wu, Yihan
    Wang, Xi
    Zhang, Shaofei
    He, Lei
    Song, Ruihua
    Nie, Jian-Yun
    INTERSPEECH 2022, 2022, : 5503 - 5507
  • [28] Proactive context-aware sensor networks
    Ahn, S
    Kim, D
    WIRELESS SENSOR NETWORKS, PROCEEDINGS, 2006, 3868 : 38 - 53
  • [29] Social Networks and Context-Aware Spam
    Brown, Garrett
    Howe, Travis
    Ihbe, Micheal
    Prakash, Atul
    Borders, Kevin
    CSCW: 2008 ACM CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK, CONFERENCE PROCEEDINGS, 2008, : 403 - 412
  • [30] Context-Aware Emotion Recognition Networks
    Lee, Jiyoung
    Kim, Seungryong
    Kim, Sunok
    Park, Jungin
    Sohn, Kwanghoon
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 10142 - 10151