Bandit Structured Prediction for Neural Sequence-to-Sequence Learning

Cited by: 19
Authors
Kreutzer, Julia [1]
Sokolov, Artem [1]
Riezler, Stefan [1, 2]
Affiliations
[1] Heidelberg Univ, Computat Linguist, Heidelberg, Germany
[2] Heidelberg Univ, IWR, Heidelberg, Germany
Source
PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1 | 2017
DOI
10.18653/v1/P17-1138
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Bandit structured prediction describes a stochastic optimization framework where learning is performed from partial feedback. This feedback is received in the form of a task loss evaluation to a predicted output structure, without having access to gold standard structures. We advance this framework by lifting linear bandit learning to neural sequence-to-sequence learning problems using attention-based recurrent neural networks. Furthermore, we show how to incorporate control variates into our learning algorithms for variance reduction and improved generalization. We present an evaluation on a neural machine translation task that shows improvements of up to 5.89 BLEU points for domain adaptation from simulated bandit feedback.
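The idea in the abstract, learning from a task loss observed only for the predicted output, with a control variate to reduce gradient variance, can be illustrated with a minimal toy sketch. This is not the authors' implementation: it assumes a softmax policy over four fixed candidate outputs instead of an attention-based sequence-to-sequence model, uses a hypothetical list of simulated losses, and uses a running mean of observed losses as one simple choice of baseline. The function name `train_bandit` and all parameter values are illustrative.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def train_bandit(losses, steps=5000, lr=0.1, seed=0):
    """Minimise expected loss from bandit feedback (toy setting).

    `losses` holds a hypothetical task loss per candidate output. The
    learner never reads the full list: at each step it only observes
    the loss of the single output it sampled.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(losses)   # one score per candidate output
    baseline, seen = 0.0, 0       # running mean of observed losses

    for _ in range(steps):
        probs = softmax(theta)
        # Predict by sampling from the current model distribution.
        y = rng.choices(range(len(losses)), weights=probs)[0]
        loss = losses[y]          # partial feedback for y only

        # Subtracting the baseline keeps the score-function gradient
        # estimate unbiased while lowering its variance.
        advantage = loss - baseline
        for k in range(len(theta)):
            # d/d theta_k of log p(y) for a softmax policy.
            grad_log_p = (1.0 if k == y else 0.0) - probs[k]
            theta[k] -= lr * advantage * grad_log_p

        seen += 1
        baseline += (loss - baseline) / seen

    return softmax(theta)


probs = train_bandit([0.9, 0.1, 0.6, 0.8])
best = max(range(len(probs)), key=probs.__getitem__)
print(best, [round(p, 3) for p in probs])
```

In this sketch the probability mass should drift toward the output with the lowest simulated loss, even though no gold-standard output is ever shown to the learner. Setting the baseline to zero recovers the plain score-function estimator, which makes the two variants easy to compare.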
Pages: 1503-1513
Page count: 11
Related papers
50 in total
  • [1] Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning
    Kreutzer, Julia
    Uyheng, Joshua
    Riezler, Stefan
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018: 1777-1788
  • [2] Sequence-to-Sequence Learning with Latent Neural Grammars
    Kim, Yoon
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [3] Sequence-to-Sequence Video Prediction by Learning Hierarchical Representations
    Fan, Kun
    Joung, Chungin
    Baek, Seungjun
    APPLIED SCIENCES-BASEL, 2020, 10 (22): 1-14
  • [4] Sequence-to-Sequence Learning for Prediction of Soil Temperature and Moisture
    Li, Xiaojie
    Tang, Jian
    Yin, Chengxiang
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [5] Retrosynthetic Reaction Prediction Using Neural Sequence-to-Sequence Models
    Liu, Bowen
    Ramsundar, Bharath
    Kawthekar, Prasad
    Shi, Jade
    Gomes, Joseph
    Quang Luu Nguyen
    Ho, Stephen
    Sloane, Jack
    Wender, Paul
    Pande, Vijay
    ACS CENTRAL SCIENCE, 2017, 3 (10): 1103-1113
  • [6] Sequence-to-sequence Prediction of Personal Computer Software by Recurrent Neural Network
    Yang, Qichuan
    He, Zhiqiang
    Ge, Fujiang
    Zhang, Yang
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017: 934-940
  • [7] Dictionary Augmented Sequence-to-Sequence Neural Network for Grapheme to Phoneme prediction
    Bruguier, Antoine
    Bakhtin, Anton
    Sharma, Dravyansh
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018: 3733-3737
  • [8] Deep Sequence-to-Sequence Neural Networks for Ionospheric Activity Map Prediction
    Cherrier, Noelie
    Castaings, Thibaut
    Boulch, Alexandre
    NEURAL INFORMATION PROCESSING, ICONIP 2017, PT V, 2017, 10638: 545-555
  • [9] Sequence-to-sequence prediction of spatiotemporal systems
    Shen, Guorui
    Kurths, Juergen
    Yuan, Ye
    CHAOS, 2020, 30 (02)
  • [10] Semantic Matching for Sequence-to-Sequence Learning
    Zhang, Ruiyi
    Chen, Changyou
    Zhang, Xinyuan
    Bai, Ke
    Carin, Lawrence
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020: 212-222