Commonsense Knowledge Aware Concept Selection for Diverse and Informative Visual Storytelling

Authors
Chen, Hong [1, 3]
Huang, Yifei [1]
Takamura, Hiroya [2, 3]
Nakayama, Hideki [1, 3]
Affiliations
[1] Univ Tokyo, Tokyo, Japan
[2] Tokyo Inst Technol, Tokyo, Japan
[3] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Visual storytelling is the task of generating relevant and interesting stories for given image sequences. In this work we aim at increasing the diversity of the generated stories while preserving the informative content from the images. We propose to foster the diversity and informativeness of a generated story by using a concept selection module that suggests a set of concept candidates. Then, we utilize a large-scale pretrained model to convert concepts and images into full stories. To enrich the candidate concepts, a commonsense knowledge graph is created for each image sequence, from which the concept candidates are proposed. To obtain appropriate concepts from the graph, we propose two novel modules that consider the correlation among candidate concepts and the image-concept correlation. Extensive automatic and human evaluation results demonstrate that our model can produce reasonable concepts. This enables our model to outperform the previous models by a large margin on the diversity and informativeness of the story, while retaining the relevance of the story to the image sequence.
Pages: 999-1008
Page count: 10