A Hierarchical Variational Neural Uncertainty Model for Stochastic Video Prediction

Cited by: 6
Authors
Chatterjee, Moitreya [1 ]
Ahuja, Narendra [1 ]
Cherian, Anoop [2 ]
Affiliations
[1] Univ Illinois, Champaign, IL 61820 USA
[2] Mitsubishi Elect Res Labs, Cambridge, MA 02139 USA
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
Funding
National Institute of Food and Agriculture (USA);
Keywords
DOI
10.1109/ICCV48922.2021.00961
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Predicting the future frames of a video is a challenging task, in part due to the underlying stochastic real-world phenomena. Prior approaches to this task typically estimate a latent prior characterizing this stochasticity, but do not account for the predictive uncertainty of the (deep learning) model. Such approaches often derive the training signal from the mean-squared error (MSE) between the generated frame and the ground truth, which can lead to sub-optimal training, especially when the predictive uncertainty is high. To this end, we introduce the Neural Uncertainty Quantifier (NUQ), a stochastic quantification of the model's predictive uncertainty, and use it to weight the MSE loss. We propose a hierarchical, variational framework to derive NUQ in a principled manner using a deep, Bayesian graphical model. Our experiments on three benchmark stochastic video prediction datasets show that our proposed framework trains more effectively than state-of-the-art models (especially when the training sets are small), while demonstrating better video generation quality and diversity on several evaluation metrics.
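The idea of weighting the MSE loss by a predictive-uncertainty estimate can be illustrated with a minimal sketch. This is not the paper's NUQ derivation, which the abstract does not spell out; it assumes the common heteroscedastic Gaussian form, in which each frame's error is scaled by a predicted precision and a log-variance penalty is added. The function name `uncertainty_weighted_mse` and the `log_var` input are hypothetical.

```python
import numpy as np

def uncertainty_weighted_mse(pred, target, log_var):
    """Per-frame MSE scaled by a predicted precision, plus a log-variance penalty.

    Assumption: `log_var` holds one predicted log-variance per frame (shape [batch]).
    This is a generic uncertainty-weighted loss, not the paper's exact formulation.
    """
    # Mean-squared error per frame, averaged over all non-batch axes.
    axes = tuple(range(1, pred.ndim))
    mse = np.mean((pred - target) ** 2, axis=axes)
    # High predicted uncertainty gives a small weight on that frame's error.
    precision = np.exp(-log_var)
    # The additive log_var term penalizes inflating uncertainty just to shrink the loss.
    return float(np.mean(precision * mse + log_var))

# Two 4x4 "frames" with a large error (per-frame MSE of 4.0).
pred = np.zeros((2, 4, 4))
target = np.full((2, 4, 4), 2.0)
confident = uncertainty_weighted_mse(pred, target, np.zeros(2))
uncertain = uncertainty_weighted_mse(pred, target, np.full(2, np.log(4.0)))
```

In this sketch, frames with a large error contribute less to the loss when the predicted uncertainty is high, which mirrors the motivation stated in the abstract: a plain MSE signal can be misleading when the predictive uncertainty is high.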
Pages: 9731-9741
Page count: 11