MODE: a multimodal open-domain dialogue dataset with explanation

Cited by: 0
Authors
Yin, Hang [1]
Lu, Pinren [1]
Li, Ziang [1]
Sun, Bin [1]
Li, Kan [1]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci & Technol, 5 South St, Beijing 100081, Peoples R China
Funding
Beijing Natural Science Foundation
Keywords
Multimodal data construction; Open-domain dialogue; AIGC; Explainability
DOI
10.1007/s10489-024-05479-x
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The lack of high-quality data has been a key obstacle to research on dialogue tasks. Recent studies build datasets through manual construction, web crawling, and similar methods. However, manually produced data is expensive, and data collected from the internet often includes generic responses, meaningless statements, and even toxic information. With the development of large language models (LLMs), generating data with LLMs has broad application potential. For open-domain multimodal dialogue tasks, three drawbacks remain: 1) there is no unified and effective framework for collecting high-quality multimodal dialogue data; 2) LLM output in multimodal dialogue generation lacks scene explanation, which hinders human understanding; 3) previous work has not quantitatively examined the impact of data quality on model performance. To improve data quality and reduce the cost of data collection, we propose the Multimodal Data Construction Framework (MDCF). MDCF uses a modal conversion module and carefully designed prompts to have the LLM generate well-formed, high-quality content. It also provides an explanation for each multimodal dialogue, which helps readers understand the conversation scenario and facilitates subsequent manual quality inspection. Based on this framework, we release a Multimodal Open-domain Dialogue dataset with Explanation (MODE). We mainly compare MODE with open-domain datasets such as Image-Chat. Both human evaluation and experiments show that high-quality datasets give models stronger understanding and generation capabilities.
Pages: 5891-5906
Page count: 16