SEQ2SEQ++: A Multitasking-Based Seq2seq Model to Generate Meaningful and Relevant Answers

Cited by: 6
Authors
Palasundram, Kulothunkan [1 ]
Sharef, Nurfadhlina Mohd [1 ]
Kasmiran, Khairul Azhar [1 ]
Azman, Azreen [1 ]
Affiliations
[1] Univ Putra Malaysia, Fac Comp Sci & Informat Technol, Intelligent Comp Res Grp, Seri Kembangan 43400, Selangor, Malaysia
Source
IEEE ACCESS | 2021 / Vol. 9 / Issue 09
Keywords
Task analysis; Chatbots; Computational modeling; Decoding; Training; Transformers; Benchmark testing; Sequence to sequence learning; natural answer generation; multitask learning; attention mechanism; ATTENTION; ENCODER;
DOI
10.1109/ACCESS.2021.3133495
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Question-answering chatbots have tremendous potential to complement humans in various fields. They are implemented using either rule-based or machine learning-based systems. Unlike the former, machine learning-based chatbots are more scalable. Sequence-to-sequence (Seq2Seq) learning is one of the most popular approaches in machine learning-based chatbots and has shown remarkable progress since its introduction in 2014. However, chatbots based on Seq2Seq learning have a weakness: they tend to generate answers that are generic and inconsistent with the questions, making the answers meaningless and potentially lowering the chatbot adoption rate. This weakness can be attributed to three issues: question encoder overfit, answer generation overfit, and language model influence. Several recent methods utilize multitask learning (MTL) to address this weakness. However, the existing MTL models show very little improvement over single-task learning and still generate generic and inconsistent answers. This paper presents a novel approach to MTL for the Seq2Seq learning model, called SEQ2SEQ++, which comprises a multifunctional encoder, an answer decoder, an answer encoder, and a ternary classifier. Additionally, SEQ2SEQ++ utilizes a dynamic task loss weighting mechanism for MTL loss calculation and a novel attention mechanism called the comprehensive attention mechanism. Experiments on the NarrativeQA and SQuAD datasets were conducted to gauge the performance of the proposed model against two recently proposed models. The experimental results show that SEQ2SEQ++ yields noteworthy improvements over both models on the bilingual evaluation understudy (BLEU), word error rate, and Distinct-2 metrics.
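The abstract mentions a dynamic task loss weighting mechanism for combining the per-task losses during MTL training, but does not specify the weighting rule. The sketch below is only an illustration of the general idea under an assumed rule (weight each task by its share of the current total loss, so tasks that are currently lagging receive more weight); the task names and values are hypothetical and not taken from the paper.

```python
# Illustrative sketch of a dynamic task-loss-weight scheme for
# multitask learning. The actual SEQ2SEQ++ weighting rule is not
# given in the abstract; here each task's weight is its share of
# the total loss, recomputed at every training step.

def dynamic_mtl_loss(task_losses):
    """Combine per-task losses into one MTL loss with dynamic weights.

    task_losses: dict mapping task name -> current scalar loss.
    Returns (combined_loss, weights), where the weights sum to 1.
    """
    total = sum(task_losses.values())
    if total == 0:
        # All tasks converged; fall back to equal weights.
        n = len(task_losses)
        weights = {t: 1.0 / n for t in task_losses}
    else:
        weights = {t: loss / total for t, loss in task_losses.items()}
    combined = sum(weights[t] * task_losses[t] for t in task_losses)
    return combined, weights


# Hypothetical losses for the three training tasks suggested by the
# abstract (answer generation, answer encoding, ternary classification);
# the numbers are made up for illustration.
losses = {"answer_gen": 2.0, "answer_enc": 1.0, "ternary_cls": 1.0}
combined, weights = dynamic_mtl_loss(losses)
```

With these example values the generation task, having the largest current loss, receives weight 0.5 while the other two receive 0.25 each, so the combined loss is 1.5. In a real training loop the weights would be recomputed from detached loss values each step before backpropagation.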
Pages: 164949-164975
Page count: 27