End-to-End Dialogue Generation Using a Single Encoder and a Decoder Cascade With a Multidimension Attention Mechanism

Cited by: 5
Authors
Belainine, Billal [1]
Sadat, Fatiha [1]
Boukadoum, Mounir [1]
Affiliation
[1] Univ Quebec Montreal, Dept Comp Sci, Montreal, PQ H3X 2Y7, Canada
Keywords
Decoding; History; Context modeling; Computer architecture; Visualization; Transformers; Predictive models; Attention mechanism; dialogue generation; hierarchical recurrent attention network (HRAN); neural machine; relevant context with self-attention (ReCoSa); sequence transduction;
DOI
10.1109/TNNLS.2022.3151347
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Human dialogues often show underlying dependencies between turns, with each interlocutor influencing the queries and responses of the other. Building on this observation, this article proposes a neural architecture for conversation modeling that considers the dialogue history of both sides. It is a generative model in which one encoder feeds three decoders that process three successive turns of dialogue to predict the next utterance, with a multidimension attention mechanism aggregating the past and current contexts for a cascade effect on each decoder. As a result, a more comprehensive account of the dialogue evolution is obtained than by focusing on a single turn, on the last encoder context, or on the user side alone. The response generation performance of the model is evaluated on three corpora of different sizes and topics, and a comparison is made with six recent generative neural architectures, using both automatic metrics and human judgments. The results show that the proposed architecture matches or improves on the state of the art for adequacy and fluency, particularly when large open-domain corpora are used for training. Moreover, it allows better tracking of the dialogue state evolution for response explainability.
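The cascade described in the abstract can be illustrated with a toy sketch. It is not the authors' model: the shared encoder is replaced by an elementwise tanh, the "multidimension attention" by plain dot-product attention, and the aggregation of past and current contexts by a simple average. All function names, dimensions, and the random inputs are assumptions made for illustration only.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """Dot-product attention: weighted average of the memory vectors."""
    scores = [sum(q * m for q, m in zip(query, vec)) for vec in memory]
    weights = softmax(scores)
    dim = len(query)
    return [sum(w * vec[i] for w, vec in zip(weights, memory)) for i in range(dim)]


def encode(turn):
    """Stand-in for the single shared encoder (hypothetical: elementwise tanh)."""
    return [[math.tanh(x) for x in token] for token in turn]


def decoder_cascade(turns, query):
    """Run one decoder step per turn, passing contexts down the cascade.

    Each decoder attends over its own turn's encoder states (current context)
    and over the contexts produced by earlier decoders (past context), then
    averages the two. The averaging is an assumption made for illustration.
    """
    past_contexts = []
    for turn in turns:
        current = attend(query, encode(turn))
        if past_contexts:
            past = attend(query, past_contexts)
            context = [(c + p) / 2.0 for c, p in zip(current, past)]
        else:
            context = current
        past_contexts.append(context)
        query = context  # the next decoder is conditioned on this context
    return past_contexts


def make_turn(rng, n_tokens, dim):
    """Build a fake dialogue turn: n_tokens random embeddings of size dim."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_tokens)]


rng = random.Random(0)
DIM = 4
# Three successive turns of different lengths, all encoded by the same encoder.
turns = [make_turn(rng, n, DIM) for n in (3, 5, 2)]
contexts = decoder_cascade(turns, query=[0.5] * DIM)
print(len(contexts), len(contexts[-1]))  # prints "3 4"
```

In this sketch the last entry of `contexts` depends on all three turns, which is the cascade effect the abstract refers to; a real implementation would use learned encoder and decoder networks and generate tokens from each context.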
Pages: 8482-8492
Page count: 11
Related papers
50 in total
  • [21] Bidirectional decoder networks for attention-based end-to-end offline handwriting recognition
    Doetsch, Patrick
    Zeyer, Albert
    Ney, Hermann
    PROCEEDINGS OF 2016 15TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR), 2016, : 361 - 366
  • [22] End-to-End Speaker Age and Height Estimation using Attention Mechanism and Triplet Loss
    Kaushik, Manav
    Pham, Van Tung
    Anh, Tran The
    Chng, Eng Siong
    2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021, : 786 - 793
  • [23] Memory Attention: Robust Alignment Using Gating Mechanism for End-to-End Speech Synthesis
    Lee, Joun Yeop
    Cheon, Sung Jun
    Choi, Byoung Jin
    Kim, Nam Soo
    IEEE SIGNAL PROCESSING LETTERS, 2020, 27 : 2004 - 2008
  • [24] Multi-Granularity Sequence Alignment Mapping for Encoder-Decoder Based End-to-End ASR
    Tang, Jian
    Zhang, Jie
    Song, Yan
    McLoughlin, Ian
    Dai, Li-Rong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 2816 - 2828
  • [25] ON TRAINING THE RECURRENT NEURAL NETWORK ENCODER-DECODER FOR LARGE VOCABULARY END-TO-END SPEECH RECOGNITION
    Lu, Liang
    Zhang, Xingxing
    Renals, Steve
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5060 - 5064
  • [26] End-to-end trained encoder-decoder convolutional neural network for fetal electrocardiogram signal denoising
    Fotiadou, Eleni
    Konopczynski, Tomasz
    Hesser, Juergen
    Vullings, Rik
    PHYSIOLOGICAL MEASUREMENT, 2020, 41 (01)
  • [27] Gigapixel end-to-end training using streaming and attention
    Dooper, Stephan
    Pinckaers, Hans
    Aswolinskiy, Witali
    Hebeda, Konnie
    Jarkman, Sofia
    van der Laak, Jeroen
    Litjens, Geert
    BIGPICTURE Consortium
    MEDICAL IMAGE ANALYSIS, 2023, 88
  • [28] Single-pass end-to-end neural decompilation using copying mechanism
    Szalay, Gergő
    Poór, Máté Bálint
    Pintér, Balázs
    Gregorics, Tibor
    Neural Computing and Applications, 2025, 37 (7) : 5309 - 5323
  • [29] End-to-end residual attention mechanism for cataractous retinal image dehazing
    Qiu, Defu
    Cheng, Yuhu
    Wang, Xuesong
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2022, 219
  • [30] End-to-end driving model based on deep learning and attention mechanism
    Zhu, Wuqiang
    Lu, Yang
    Zhang, Yongliang
    Wei, Xing
    Wei, Zhen
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 42 (04) : 3337 - 3348