Sentiment-Oriented Transformer-Based Variational Autoencoder Network for Live Video Commenting

Cited by: 3
Authors
Fu, Fengyi [1 ]
Fang, Shancheng [1 ]
Chen, Weidong [1 ]
Mao, Zhendong [1 ]
Affiliations
[1] Univ Sci & Technol China, 100 Fuxing Rd, Hefei 230000, Anhui, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Automatic live video commenting; multi-modal learning; variational autoencoder; batch attention mechanism;
DOI
10.1145/3633334
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Automatic live video commenting is attracting increasing attention because of its significance for narration generation, topic explanation, and similar tasks. However, current methods do not account for the diverse sentiments of generated comments. Sentiment is critical in interactive commenting, yet it has received little research attention so far. In this article, we therefore propose a Sentiment-oriented Transformer-based Variational Autoencoder (So-TVAE) network, which consists of a sentiment-oriented diversity encoder module and a batch attention module, to achieve diverse video commenting with multiple sentiments and multiple semantics. Specifically, our sentiment-oriented diversity encoder combines a VAE with a random mask mechanism to achieve semantic diversity under sentiment guidance; its output is then fused with cross-modal features to generate live video comments. We also propose a batch attention module to alleviate the problem of missing sentimental samples, which is caused by the data imbalance that is common in live videos as video popularity varies. Extensive experiments on the Livebot and VideoIC datasets demonstrate that the proposed So-TVAE outperforms state-of-the-art methods in both the quality and the diversity of generated comments. Related code is available at https://github.com/fufy1024/So-TVAE.
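The abstract names three building blocks: a VAE, a random mask mechanism, and attention over a batch. The following is a minimal sketch of the generic, textbook versions of these ideas in plain Python. It is not the authors' implementation, which lives in the repository linked above. All function names, the `<mask>` token, and the choice of scaled dot-product attention across the batch axis are assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) (standard VAE trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, I)) for a diagonal Gaussian posterior."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))


def random_mask(tokens, mask_prob, rng, mask_token="<mask>"):
    """Replace each token independently with probability mask_prob.

    Hypothetical helper: the abstract does not say what is masked or how.
    """
    return [mask_token if rng.random() < mask_prob else t for t in tokens]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def batch_attention(features):
    """Scaled dot-product attention across the batch axis (an assumption).

    features: list of B vectors, one per sample, each of length D.
    Every sample attends to all samples in the batch, itself included,
    so a sample can borrow information from the others.
    """
    dim = len(features[0])
    mixed = []
    for query in features:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
                  for key in features]
        weights = softmax(scores)
        mixed.append([sum(w * value[d] for w, value in zip(weights, features))
                      for d in range(dim)])
    return mixed


if __name__ == "__main__":
    rng = random.Random(0)
    z = reparameterize([0.0, 0.0], [0.0, 0.0], rng)
    print("latent sample:", z)
    print("masked comment:", random_mask(["so", "cute", "!"], 0.5, rng))
    print("batch-mixed:", batch_attention([[1.0, 0.0], [0.0, 1.0]]))
```

In this sketch the KL term is what keeps the latent space close to a standard normal, and the batch-axis attention is one plausible way for samples with under-represented sentiments to draw on other samples in the same batch.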
Pages: 24