SEQ2SEQ++: A Multitasking-Based Seq2seq Model to Generate Meaningful and Relevant Answers

Cited by: 6
Authors
Palasundram, Kulothunkan [1 ]
Sharef, Nurfadhlina Mohd [1 ]
Kasmiran, Khairul Azhar [1 ]
Azman, Azreen [1 ]
Affiliations
[1] Univ Putra Malaysia, Fac Comp Sci & Informat Technol, Intelligent Comp Res Grp, Seri Kembangan 43400, Selangor, Malaysia
Source
IEEE ACCESS | 2021, Vol. 9, Issue 09
Keywords
Task analysis; Chatbots; Computational modeling; Decoding; Training; Transformers; Benchmark testing; Sequence to sequence learning; natural answer generation; multitask learning; attention mechanism; ATTENTION; ENCODER;
DOI
10.1109/ACCESS.2021.3133495
Chinese Library Classification (CLC): TP [Automation Technology; Computer Technology]
Discipline code: 0812
Abstract
Question-answering chatbots have tremendous potential to complement humans in various fields. They are implemented using either rule-based or machine learning-based systems. Unlike the former, machine learning-based chatbots are more scalable. Sequence-to-sequence (Seq2Seq) learning is one of the most popular approaches in machine learning-based chatbots and has shown remarkable progress since its introduction in 2014. However, chatbots based on Seq2Seq learning exhibit a weakness: they tend to generate answers that are generic or inconsistent with the questions, rendering them meaningless and thus potentially lowering chatbot adoption rates. This weakness can be attributed to three issues: question encoder overfit, answer generation overfit, and language model influence. Several recent methods utilize multitask learning (MTL) to address this weakness. However, the existing MTL models show very little improvement over single-task learning and still generate generic and inconsistent answers. This paper presents a novel approach to MTL for the Seq2Seq learning model called SEQ2SEQ++, which comprises a multifunctional encoder, an answer decoder, an answer encoder, and a ternary classifier. Additionally, SEQ2SEQ++ utilizes a dynamic task-loss weighting mechanism for MTL loss calculation and a novel attention mechanism called the comprehensive attention mechanism. Experiments on the NarrativeQA and SQuAD datasets were conducted to gauge the performance of the proposed model against two recently proposed models. The experimental results show that SEQ2SEQ++ yields noteworthy improvements over both models on the bilingual evaluation understudy (BLEU), word error rate, and Distinct-2 metrics.
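The abstract mentions a dynamic task-loss weighting mechanism for combining the losses of the multiple training tasks. The paper's exact formula is not reproduced here; the following is a minimal sketch, assuming a generic normalized inverse-loss heuristic, of how per-task losses can be reweighted each step so that no single task (e.g., answer decoding vs. ternary classification) dominates the combined MTL objective. The function names and the three-task example are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of dynamic task-loss weighting for multitask learning.
# Assumption: weights are set inversely proportional to each task's
# current loss and normalized to sum to 1 (NOT the paper's formula).

def dynamic_task_weights(losses, eps=1e-8):
    """Return normalized weights, larger for tasks with smaller loss."""
    inverse = [1.0 / (loss + eps) for loss in losses]
    total = sum(inverse)
    return [w / total for w in inverse]

def combined_loss(losses):
    """Weighted sum of per-task losses using the dynamic weights."""
    weights = dynamic_task_weights(losses)
    return sum(w * loss for w, loss in zip(weights, losses))

# Illustrative per-task losses: answer decoder, answer encoder, classifier.
task_losses = [2.0, 1.0, 0.5]
print(dynamic_task_weights(task_losses))  # weights sum to 1.0
print(combined_loss(task_losses))
```

In practice such weights would be recomputed every training step from the current (or recent average) task losses, letting the balance between tasks shift as training progresses.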
Pages: 164949-164975
Number of pages: 27