Speak From Heart: An Emotion-Guided LLM-Based Multimodal Method for Emotional Dialogue Generation

Cited by: 1
Authors
Liu, Chenxiao [1]
Xie, Zheyong [1]
Zhao, Sirui [1]
Zhou, Jin [1]
Xu, Tong [1]
Li, Minglei [2]
Chen, Enhong [1]
Affiliations
[1] Univ Sci & Technol China, Hefei, Peoples R China
[2] Huawei Cloud, Shenzhen, Peoples R China
Source
PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024 | 2024
Funding
National Natural Science Foundation of China;
Keywords
Large Language Models; Emotional expression; Multimodal cues; Emotional retrieval module; Dialogue systems;
DOI
10.1145/3652583.3658104
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent advancements in Large Language Models (LLMs) have greatly enhanced the generation capabilities of dialogue systems. However, progress on emotional expression in dialogue may still be limited, especially in capturing and processing the multimodal cues for emotional expression. It is therefore urgent to fully adapt the multimodal understanding ability and transferability of LLMs to enhance their emotion-oriented multimodal processing capabilities. To that end, in this paper, we propose a novel Emotion-Guided Multimodal Dialogue model based on an LLM, termed ELMD. Specifically, to enhance the emotional expression ability of LLMs, ELMD customizes an emotional retrieval module, which mainly provides appropriate response demonstrations that help the LLM understand the emotional context. Subsequently, a two-stage training strategy, founded on this demonstration support, is proposed to help uncover the nuanced emotions behind multimodal information and construct natural responses. Comprehensive experiments demonstrate the effectiveness and superiority of ELMD.
Pages: 533-542
Page count: 10