A Dynamic Multi-Scale Voxel Flow Network for Video Prediction

Cited by: 29

Authors:

Hu, Xiaotao [1,2]

Huang, Zhewei [2]

Huang, Ailin [2,3]

Xu, Jun [4]

Zhou, Shuchang [2]

Affiliations:

[1] Nankai Univ, Coll Comp Sci, Tianjin, Peoples R China

[2] Megvii Technol, Beijing, Peoples R China

[3] Wuhan Univ, Wuhan, Peoples R China

[4] Nankai Univ, Sch Stat & Data Sci, Tianjin, Peoples R China

Source:

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023

Funding:

National Natural Science Foundation of China;

Keywords:

DOI:

10.1109/CVPR52729.2023.00593

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The performance of video prediction has been greatly boosted by advanced deep neural networks. However, most of the current methods suffer from large model sizes and require extra inputs, e.g., semantic/depth maps, for promising performance. For efficiency consideration, in this paper, we propose a Dynamic Multi-scale Voxel Flow Network (DMVFN) to achieve better video prediction performance at lower computational costs with only RGB images, than previous methods. The core of our DMVFN is a differentiable routing module that can effectively perceive the motion scales of video frames. Once trained, our DMVFN selects adaptive sub-networks for different inputs at the inference stage. Experiments on several benchmarks demonstrate that our DMVFN is an order of magnitude faster than Deep Voxel Flow [35] and surpasses the state-of-the-art iterative-based OPT [63] on generated image quality.

Pages: 6121-6131

Page count: 11
