A Hierarchical Variational Neural Uncertainty Model for Stochastic Video Prediction

Cited by: 6

Authors:

Chatterjee, Moitreya [1]

Ahuja, Narendra [1]

Cherian, Anoop [2]

Affiliations:

[1] Univ Illinois, Champaign, IL 61820 USA

[2] Mitsubishi Elect Res Labs, Cambridge, MA 02139 USA

Source:

2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021

Funding:

U.S. National Institute of Food and Agriculture;

Keywords:

DOI:

10.1109/ICCV48922.2021.00961

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Predicting the future frames of a video is a challenging task, in part due to the underlying stochastic real-world phenomena. Prior approaches to this task typically estimate a latent prior characterizing this stochasticity, but do not account for the predictive uncertainty of the (deep learning) model. Such approaches often derive the training signal from the mean-squared error (MSE) between the generated frame and the ground truth, which can lead to sub-optimal training, especially when the predictive uncertainty is high. To this end, we introduce the Neural Uncertainty Quantifier (NUQ), a stochastic quantification of the model's predictive uncertainty, and use it to weight the MSE loss. We propose a hierarchical, variational framework to derive NUQ in a principled manner using a deep, Bayesian graphical model. Our experiments on three benchmark stochastic video prediction datasets show that our proposed framework trains more effectively than state-of-the-art models (especially when the training sets are small), while demonstrating better video generation quality and diversity across several evaluation metrics.

Pages: 9731-9741

Page count: 11

