A Hierarchical Variational Neural Uncertainty Model for Stochastic Video Prediction

Citations: 6
Authors
Chatterjee, Moitreya [1]
Ahuja, Narendra [1]
Cherian, Anoop [2]
Affiliations
[1] Univ Illinois, Champaign, IL 61820 USA
[2] Mitsubishi Elect Res Labs, Cambridge, MA 02139 USA
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
Funding
National Institute of Food and Agriculture (USA)
DOI
10.1109/ICCV48922.2021.00961
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Predicting the future frames of a video is a challenging task, in part because the underlying real-world phenomena are stochastic. Prior approaches to this task typically estimate a latent prior characterizing this stochasticity, but do not account for the predictive uncertainty of the (deep learning) model. Such approaches often derive the training signal from the mean-squared error (MSE) between the generated frame and the ground truth, which can lead to sub-optimal training, especially when the predictive uncertainty is high. To this end, we introduce the Neural Uncertainty Quantifier (NUQ), a stochastic quantification of the model's predictive uncertainty, and use it to weight the MSE loss. We propose a hierarchical, variational framework to derive NUQ in a principled manner using a deep, Bayesian graphical model. Our experiments on three benchmark stochastic video prediction datasets show that the proposed framework trains more effectively than state-of-the-art models (especially when the training sets are small), while demonstrating better video generation quality and diversity across several evaluation metrics.
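The idea of weighting the MSE loss by an uncertainty estimate can be illustrated with a minimal sketch. This is not the paper's NUQ formulation, which the abstract does not spell out. It assumes a generic Gaussian-likelihood-style weighting in which a scalar `precision` (inverse variance) scales the MSE and a log term keeps the precision from collapsing to zero. The function name `uncertainty_weighted_mse` and its parameters are hypothetical.

```python
import math


def uncertainty_weighted_mse(pred, target, precision):
    """Illustrative uncertainty-weighted MSE (hypothetical, not the paper's loss).

    Computes 0.5 * (precision * MSE - log(precision)), i.e. a Gaussian
    negative log-likelihood up to an additive constant, averaged per element.
    `precision` is an assumed scalar inverse variance supplied by the caller.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equal in length")
    if precision <= 0:
        raise ValueError("precision must be positive")
    # Plain mean-squared error between prediction and ground truth.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    # Low precision (high uncertainty) down-weights the error term,
    # while the -log(precision) term penalizes claiming high uncertainty.
    return 0.5 * (precision * mse - math.log(precision))


# Toy frame of two pixels: one exact, one off by 2.
pred = [1.0, 2.0]
target = [1.0, 4.0]
confident = uncertainty_weighted_mse(pred, target, precision=1.0)
uncertain = uncertainty_weighted_mse(pred, target, precision=0.5)
```

In this toy case the error is large, so the lower precision yields the smaller loss. That is the sense in which an uncertainty-aware weight can soften the training signal when predictions are unreliable.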
Pages: 9731-9741
Page count: 11