Multi-source inverse-curriculum-based training for low-resource dialogue generation

Cited by: 0
Authors
Fuwei Cui
Hui Di
Hui Huang
Hongjie Ren
Kazushige Ouchi
Ze Liu
Jinan Xu
Affiliations
[1] Institute of Advanced Control System, School of Electronic Information Engineering, Beijing Jiaotong University
[2] Toshiba (China) Co., Ltd.
[3] School of Computer Information Technology, Beijing Jiaotong University
Source
Applied Intelligence | 2023, Vol. 53
Keywords
Dialogue generation; Low-resource dialogue generation; Data augmentation; Curriculum learning
DOI
Not available
Abstract
An effective dialogue system needs a large amount of training data, but existing dialogue training data is insufficient. Although pre-trained models have made great progress in recent years and can alleviate the low-resource dialogue problem to some extent, they are large and difficult to deploy. How to improve the performance of a dialogue model without additional annotated data and without enlarging the model has therefore become a new challenge. We propose a multi-source data augmentation method for low-resource dialogue generation that utilizes inverse curriculum learning (inverse CL). First, we adopt three data augmentation methods, namely round-trip translation, paraphrasing, and generation with a pre-trained model, to produce augmented data. Next, we propose a new training strategy based on inverse CL to exploit the different sources of augmented data. Our method outperforms the baselines on all evaluation metrics, which shows the effectiveness of the proposed training strategy for dialogue generation. To the best of our knowledge, this is the first systematic investigation of data augmentation for dialogue generation.
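The abstract names three augmentation sources but does not detail them. The sketch below is a minimal illustration of how such sources could be assembled from off-the-shelf Hugging Face pipelines; the checkpoint choices and helper names are assumptions for illustration, not the authors' actual setup.

```python
# Minimal sketch of the three augmentation sources named in the abstract:
# round-trip translation, paraphrasing, and a pre-trained dialogue model.
# Checkpoint choices are assumptions, not the paper's actual models.
from transformers import pipeline

to_de = pipeline("translation_en_to_de", model="Helsinki-NLP/opus-mt-en-de")
to_en = pipeline("translation_de_to_en", model="Helsinki-NLP/opus-mt-de-en")
paraphraser = pipeline("text2text-generation", model="tuner007/pegasus_paraphrase")
dialogue_lm = pipeline("text-generation", model="microsoft/DialoGPT-small")

def round_trip(text: str) -> str:
    # en -> de -> en; the round trip perturbs the surface form of the utterance.
    de = to_de(text)[0]["translation_text"]
    return to_en(de)[0]["translation_text"]

def paraphrase(text: str) -> str:
    # Rewrite the utterance with a dedicated paraphrase model.
    return paraphraser(text, max_length=60, do_sample=True)[0]["generated_text"]

def lm_response(context: str) -> str:
    # Sample an alternative response to the context from a pre-trained dialogue LM.
    out = dialogue_lm(context, max_new_tokens=40, do_sample=True)[0]["generated_text"]
    return out[len(context):].strip()
```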
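The inverse-CL training strategy is likewise only outlined here. One plausible reading, sketched below under that assumption, is stage-wise training that presents the noisier augmented sources first and the clean original data last, i.e. the reverse of a conventional easy-to-hard curriculum; `train_epoch` is a hypothetical helper standing in for a standard training loop, and the stage ordering may differ from the paper's actual schedule.

```python
# Hedged sketch of a stage-wise inverse curriculum: noisier augmented data
# first, original clean data last. The ordering and the train_epoch helper
# are assumptions; the paper's exact schedule may differ.
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, str]  # (dialogue context, response)

def inverse_curriculum_train(
    model,
    stages: List[Iterable[Pair]],  # e.g. [round_trip, paraphrase, lm_generated, original]
    train_epoch: Callable[[object, Iterable[Pair]], float],
    epochs_per_stage: int = 1,
):
    # Walk through the stages in order; each stage fine-tunes on one data source.
    for stage_id, data in enumerate(stages):
        for _ in range(epochs_per_stage):
            loss = train_epoch(model, data)
            print(f"stage {stage_id}: mean loss {loss:.4f}")
    return model
```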
Pages: 13665-13676
Number of pages: 11
Related papers
50 entries in total
  • [1] Multi-source inverse-curriculum-based training for low-resource dialogue generation
    Cui, Fuwei
    Di, Hui
    Huang, Hui
    Ren, Hongjie
    Ouchi, Kazushige
    Liu, Ze
    Xu, Jinan
    APPLIED INTELLIGENCE, 2023, 53 (11) : 13665 - 13676
  • [2] Low-Resource Dialogue Summarization with Domain-Agnostic Multi-Source Pretraining
    Zou, Yicheng
    Zhu, Bolin
    Hu, Xingwu
    Gui, Tao
    Zhang, Qi
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 80 - 91
  • [3] A Unified Data Augmentation Framework for Low-Resource Multi-domain Dialogue Generation
    Liu, Yongkang
    Nie, Ercong
    Feng, Shi
    Hua, Zheng
    Ding, Zifeng
    Wang, Daling
    Zhang, Yifei
    Schuetze, Hinrich
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, PT II, ECML PKDD 2024, 2024, 14942 : 162 - 177
  • [4] Improving a Multi-Source Neural Machine Translation Model with Corpus Extension for Low-Resource Languages
    Choi, Gyu-Hyeon
    Shin, Jong-Hun
    Kim, Young-Kil
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 900 - 904
  • [5] State Value Generation with Prompt Learning and Self-Training for Low-Resource Dialogue State Tracking
    Gu, Ming
    Yang, Yan
    Chen, Chengcai
    Yu, Zhou
    ASIAN CONFERENCE ON MACHINE LEARNING, VOL 222, 2023, 222
  • [6] A Stack-Propagation Framework for Low-Resource Personalized Dialogue Generation
    Song, Haoyu
    Zhang, Wei-Nan
    Zhang, Kaiyan
    Liu, Ting
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2023, 41 (03)
  • [7] Multi-Source Multi-Type Knowledge Exploration and Exploitation for Dialogue Generation
    Ni, Xuanfan
    Dai, Hongliang
    Ren, Zhaochun
    Li, Piji
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 12522 - 12537
  • [8] Variational model for low-resource natural language generation in spoken dialogue systems
    Tran, Van-Khanh
    Nguyen, Le-Minh
    COMPUTER SPEECH AND LANGUAGE, 2021, 65
  • [9] Towards Low-Resource Semi-Supervised Dialogue Generation with Meta-Learning
    Huang, Yi
    Feng, Junlan
    Ma, Shuo
    Du, Xiaoyu
    Wu, Xiaoting
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 4123 - 4128