An Efficient Multi-Scale Attention Feature Fusion Network for 4K Video Frame Interpolation

Cited by: 1
Authors
Ning, Xin [1]
Li, Yuhang [1]
Feng, Ziwei [1]
Liu, Jinhua [1]
Ding, Youdong [1, 2]
Affiliations
[1] Shanghai Univ, Coll Shanghai Film, 788 Guangzhong Rd, Shanghai 200072, Peoples R China
[2] Shanghai Engn Res Ctr Mot Picture Special Effects, 788 Guangzhong Rd, Shanghai 200072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
4K video frame interpolation; 4K video dataset; self-attention; multi-scale; high frame rate;
DOI
10.3390/electronics13061037
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Video frame interpolation generates intermediate frames in a video to showcase finer details. However, most methods are trained and tested only on low-resolution datasets, and 4K video frame interpolation remains under-studied. This limitation makes it challenging to handle high-frame-rate video processing in real-world scenarios. In this paper, we introduce UHD4K120FPS, a 4K video dataset at 120 fps that contains large motion. We also propose a novel framework for the 4K video frame interpolation task, based on a multi-scale pyramid network structure. We introduce self-attention to capture long-range dependencies and self-similarities in pixel space, which overcomes the limited receptive field of convolutional operations. To reduce computational cost, we lighten self-attention with a simple mapping-based approach that still allows content-aware aggregation weights. Extensive quantitative and qualitative experiments demonstrate the performance of the proposed model on the UHD4K120FPS dataset and illustrate the effectiveness of our method for 4K video frame interpolation. In addition, we evaluate the robustness of the model on low-resolution benchmark datasets.
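Illustrative note (not part of the original record): the abstract's motivation for lightening self-attention can be seen from a minimal sketch of plain scaled dot-product self-attention. This is the generic textbook form, not the authors' mapping-based variant, which the abstract does not specify. The function names `self_attention` and `attention_pairs`, the identity query/key/value projections, and the 3840 x 2160 frame size are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Generic scaled dot-product self-attention (hypothetical helper).

    features: list of N feature vectors, each of dimension d.
    Queries, keys and values are the features themselves (identity
    projections), an assumption that keeps the sketch short.
    Returns (outputs, weights); weights[i][j] is the content-aware
    weight that position i assigns to position j.
    """
    n = len(features)
    d = len(features[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for q in features:
        # Every position is compared with every other one: N scores per query.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in features]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[j] * features[j][c] for j in range(n)) for c in range(d)]
        )
    return outputs, weights


def attention_pairs(height, width):
    """Number of query-key pairs if every pixel attends to every pixel."""
    n = height * width
    return n * n


if __name__ == "__main__":
    out, w = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    print(out)
    # Assumed UHD frame size; the record itself only says "4K".
    print(attention_pairs(2160, 3840))
```

Because the number of query-key pairs grows with the square of the pixel count, full pixel-space attention on a frame of the assumed size would involve on the order of 10^13 pairs, which is why some form of lightened attention is needed at this resolution.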
Pages: 16