A Dynamic Multi-Scale Voxel Flow Network for Video Prediction

Cited by: 29
Authors
Hu, Xiaotao [1, 2]
Huang, Zhewei [2]
Huang, Ailin [2, 3]
Xu, Jun [4]
Zhou, Shuchang [2]
Affiliations
[1] Nankai Univ, Coll Comp Sci, Tianjin, Peoples R China
[2] Megvii Technol, Beijing, Peoples R China
[3] Wuhan Univ, Wuhan, Peoples R China
[4] Nankai Univ, Sch Stat & Data Sci, Tianjin, Peoples R China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR52729.2023.00593
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The performance of video prediction has been greatly boosted by advanced deep neural networks. However, most of the current methods suffer from large model sizes and require extra inputs, e.g., semantic/depth maps, for promising performance. For efficiency consideration, in this paper, we propose a Dynamic Multi-scale Voxel Flow Network (DMVFN) to achieve better video prediction performance at lower computational costs with only RGB images, than previous methods. The core of our DMVFN is a differentiable routing module that can effectively perceive the motion scales of video frames. Once trained, our DMVFN selects adaptive sub-networks for different inputs at the inference stage. Experiments on several benchmarks demonstrate that our DMVFN is an order of magnitude faster than Deep Voxel Flow [35] and surpasses the state-of-the-art iterative-based OPT [63] on generated image quality.
Pages: 6121 - 6131
Number of pages: 11
Related papers
50 in total
  • [1] Multi-scale Fusion Dynamic Graph Neural Network for Traffic Flow Prediction
    Weng, Wenchao
    Chen, Qikai
    Dai, Yu
    Chen, Jingyang
    Chen, Dongliang
    ACM International Conference Proceeding Series, 2023: 85 - 90
  • [2] Multi-scale Siamese prediction network for video anomaly detection
    Jingxian Yang
    Yiheng Cai
    Dan Liu
    Jin Xie
    Signal, Image and Video Processing, 2023, 17 : 671 - 678
  • [3] Multi-scale Siamese prediction network for video anomaly detection
    Yang, Jingxian
    Cai, Yiheng
    Liu, Dan
    Xie, Jin
    SIGNAL IMAGE AND VIDEO PROCESSING, 2023, 17 (03) : 671 - 678
  • [4] Multi-Scale Spatiotemporal Feature Fusion Network for Video Saliency Prediction
    Zhang, Yunzuo
    Zhang, Tian
    Wu, Cunyu
    Tao, Ran
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 4183 - 4193
  • [5] Dynamic multi-scale spatial-temporal graph convolutional network for traffic flow prediction
    Gao, Ming
    Du, Zhuoran
    Qin, Hongmao
    Wang, Wei
    Jin, Guangyin
    Xie, Guotao
    KNOWLEDGE-BASED SYSTEMS, 2024, 305
  • [6] Dynamic multi-scale spatial-temporal graph convolutional network for traffic flow prediction
    Hu, Na
    Zhang, Dafang
    Xie, Kun
    Liang, Wei
    Li, Kuan-Ching
    Zomaya, Albert Y.
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2024, 158 : 323 - 332
  • [7] Multi-Scale Convolutional Neural Network-Based Intra Prediction for Video Coding
    Wang, Yang
    Fan, Xiaopeng
    Liu, Shaohui
    Zhao, Debin
    Gao, Wen
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (07) : 1803 - 1815
  • [8] Transformer-Based Multi-Scale Feature Integration Network for Video Saliency Prediction
    Zhou, Xiaofei
    Wu, Songhe
    Shi, Ran
    Zheng, Bolun
    Wang, Shuai
    Yin, Haibing
    Zhang, Jiyong
    Yan, Chenggang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (12) : 7696 - 7707
  • [9] MULTI-SCALE PREDICTION NETWORK FOR LUNG SEGMENTATION
    Gu, Yuchong
    Lai, Yaoming
    Xie, Peiliang
    Wei, Jun
    Lu, Yao
    2019 IEEE 16TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI 2019), 2019: 438 - 442
  • [10] GLSNN Network: A Multi-Scale Spatiotemporal Prediction Model for Urban Traffic Flow
    Cai, Benhe
    Wang, Yanhui
    Huang, Chong
    Liu, Jiahao
    Teng, Wenxin
    SENSORS, 2022, 22 (22)