ST-MFNet: A Spatio-Temporal Multi-Flow Network for Frame Interpolation

Cited by: 29
Authors
Danier, Duolikun [1]
Zhang, Fan [1]
Bull, David [1]
Affiliations
[1] Univ Bristol, Bristol, Avon, England
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
DOI
10.1109/CVPR52688.2022.00351
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video frame interpolation (VFI) is currently a very active research topic, with applications spanning computer vision, post-production and video encoding. VFI can be extremely challenging, particularly in sequences containing large motions, occlusions or dynamic textures, where existing approaches fail to offer perceptually robust interpolation performance. In this context, we present a novel deep-learning-based VFI method, ST-MFNet, built on a Spatio-Temporal Multi-Flow architecture. ST-MFNet employs a new multi-scale multi-flow predictor to estimate many-to-one intermediate flows, which are combined with conventional one-to-one optical flows to capture both large and complex motions. To enhance interpolation performance for various textures, a 3D CNN is also employed to model the content dynamics over an extended temporal window. Moreover, ST-MFNet has been trained within an ST-GAN framework, which was originally developed for texture synthesis, with the aim of further improving perceptual interpolation quality. Our approach has been comprehensively evaluated against fourteen state-of-the-art VFI algorithms, clearly demonstrating that ST-MFNet consistently outperforms these benchmarks on varied and representative test datasets, with significant gains of up to 1.09 dB in PSNR for cases including large motions and dynamic textures. Our source code is available at https://github.com/danielism97/ST-MFNet.
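To put the reported PSNR gain in context, the following is a minimal sketch of the standard PSNR definition, 10 * log10(peak^2 / MSE), in plain Python. It is not taken from the ST-MFNet repository; the function names `psnr` and `mse_ratio_for_gain`, the flat-list pixel representation, and the 8-bit peak value of 255 are assumptions made for illustration only.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equally sized flat sequences of pixel values.

    Assumes 8-bit content by default (peak=255); this is an illustrative
    choice, not a detail stated in the abstract.
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel positions.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    # Hypothetical 4-pixel "frames", for demonstration only.
    ground_truth = [10, 20, 30, 40]
    interpolated = [12, 18, 33, 40]
    print(psnr(ground_truth, interpolated))
    print(mse_ratio_for_gain(1.09))
```

Because PSNR is logarithmic in MSE, a gain of 1.09 dB corresponds to the MSE shrinking by a factor of about 1.29, that is, roughly 22% lower mean squared error at the same peak value.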
Pages: 3511-3521
Number of pages: 11