Video frame interpolation via spatial multi-scale modelling

Citations: 0
Authors
Qu, Zhe [1]
Liu, Weijing [2]
Cui, Lizhen [1]
Yang, Xiaohui [2]
Affiliations
[1] Shandong Univ, Sch Software, Jinan, Peoples R China
[2] Univ Jinan, Sch Informat Sci & Engn, Jinan, Peoples R China
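Funding
National Key Research and Development Program of China
Keywords
computer vision; image motion analysis; learning (artificial intelligence); video signal processing
DOI
10.1049/cvi2.12281
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract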
Video frame interpolation (VFI) is a technique that synthesises intermediate frames between adjacent original video frames to achieve temporal super-resolution of the video. However, existing methods usually rely on heavy model architectures with a large number of parameters. The authors introduce an efficient VFI network based on multiple lightweight convolutional units and a local three-scale encoding (LTSE) structure. Firstly, the authors introduce an LTSE structure with two-level attention cascades. This design is tailored to capture details and contextual information efficiently across diverse scales in images. Secondly, the authors introduce recurrent convolutional layers (RCL) and residual operations, designing the recurrent residual convolutional unit to optimise the LTSE structure. Additionally, a lightweight convolutional unit named the separable recurrent residual convolutional unit (S_RRCU) is introduced to reduce the model parameters. Finally, the authors obtain the three-scale decoding features from the decoder and warp them to produce a set of three-scale pre-warped maps. These maps are fused in the synthesis network to generate high-quality interpolated frames. The experimental results indicate that the proposed approach achieves superior performance with fewer model parameters. This is a revised version of the authors' manuscript, which proposes a lightweight VFI network based on multiple lightweight convolutional units and the three-scale encoding-decoding structure. The proposed model learns features adaptively to ensure effective inference of motion information. Moreover, the authors design the lightweight convolutional unit S_RRCU to decrease the model parameters.
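The abstract attributes the parameter reduction to a "separable" convolutional unit. As a rough illustration only, the sketch below compares the parameter count of a standard convolution with that of a depthwise-separable convolution (a depthwise k x k layer followed by a 1 x 1 pointwise layer). It assumes that "separable" refers to depthwise-separable convolution, which the abstract does not confirm; the function names and the 64-channel, 3 x 3 configuration are hypothetical and are not taken from the paper.

```python
# Illustrative parameter counts; NOT the authors' S_RRCU implementation.
# Assumption: "separable" means depthwise-separable convolution.

def conv_params(c_in, c_out, k, bias=True):
    """Parameter count of a standard k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def separable_conv_params(c_in, c_out, k, bias=True):
    """Parameter count of a depthwise k x k conv plus a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k + (c_in if bias else 0)   # one k x k filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)  # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer size chosen only for illustration.
standard = conv_params(64, 64, 3)
separable = separable_conv_params(64, 64, 3)
print(standard, separable, round(standard / separable, 2))
```

For this hypothetical layer the separable variant needs roughly one eighth of the parameters. The saving in the actual network depends on its channel widths and kernel sizes, which this record does not state.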
Pages: 458-472
Number of pages: 15