Progressive Motion Boosting for Video Frame Interpolation

Cited by: 2
Authors
Xiao, Jing [1]
Xu, Kangmin [1]
Hu, Mengshun [1]
Liao, Liang [2]
Wang, Zheng [1]
Lin, Chia-Wen [3,4]
Wang, Mi [5]
Satoh, Shin'ichi [6]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
[3] Natl Tsing Hua Univ, Dept Elect Engn, Hsinchu 30013, Taiwan
[4] Natl Tsing Hua Univ, Inst Commun Engn, Hsinchu 30013, Taiwan
[5] Wuhan Univ, State Key Lab Informat Engn Surveying Mapping & Remote Sensing, Wuhan 430072, Peoples R China
[6] Natl Inst Informat, Digital Content & Media Sci Res Div, Tokyo 1018430, Japan
Funding
National Natural Science Foundation of China
Keywords
Frame interpolation; motion estimation; multi-scale framework; progressive boosting
DOI
10.1109/TMM.2022.3233310
Chinese Library Classification
TP [Automation Technology; Computer Technology]
Discipline Classification Code
0812
Abstract
Video frame interpolation has made great progress in estimating advanced optical flow and sequentially synthesizing in-between frames. However, interpolation across various resolutions and motions remains challenging when relying on pre-trained networks with limited or fixed scales. Inspired by the success of the coarse-to-fine scheme for video frame interpolation, i.e., gradually interpolating frames at increasing resolutions, we propose a progressive boosting network (ProBoost-Net) based on a multi-scale framework that supports a flexible number of recurrent scales and gradually refines optical flow estimation and frame interpolation. Specifically, we design a dense motion boosting (DMB) module that transfers features close to the real motion to the decoded features of later scales, providing complementary information to further refine the motion. Furthermore, to ensure the accuracy of the estimated motion features at each scale, we propose a motion adaptive fusion (MAF) module that adaptively handles motions with different receptive fields according to the motion conditions. Thanks to the framework's flexible recurrent scales, the number of scales can be customized to trade off computation against quality for the target application scenario. Extensive experiments on various datasets demonstrate the superiority of the proposed method over state-of-the-art approaches in diverse scenarios.
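
To make the coarse-to-fine scheme concrete, the following minimal Python/PyTorch sketch illustrates a progressive multi-scale interpolation loop of the kind the abstract describes: flow is estimated at the coarsest scale first, then upsampled and refined scale by scale while the intermediate frame is synthesized at each level. This is an illustration only, not the authors' implementation; the flow_estimator and synthesizer callables, the bilinear up/down-sampling, and the num_scales parameter are assumptions standing in for the paper's DMB and MAF modules.

# A minimal sketch of the coarse-to-fine, recurrent multi-scale scheme the
# abstract describes; not the authors' implementation. The flow_estimator and
# synthesizer callables, the bilinear up/down-sampling, and num_scales are
# illustrative assumptions standing in for the paper's DMB and MAF modules.
import torch.nn.functional as F

def progressive_interpolate(frame0, frame1, flow_estimator, synthesizer, num_scales=3):
    """Estimate the middle frame by refining flow from the coarsest to the finest scale."""
    # Build an image pyramid; the last entry is the coarsest scale.
    pyramid = [(frame0, frame1)]
    for _ in range(num_scales - 1):
        f0, f1 = pyramid[-1]
        pyramid.append((F.avg_pool2d(f0, 2), F.avg_pool2d(f1, 2)))

    flow, mid = None, None
    for f0, f1 in reversed(pyramid):  # coarsest -> finest
        if flow is not None:
            # Upsample the previous-scale flow and double its magnitude to
            # match the finer resolution.
            flow = 2.0 * F.interpolate(flow, scale_factor=2,
                                       mode="bilinear", align_corners=False)
        # Refine (or, at the coarsest scale, initialize) the flow estimate.
        flow = flow_estimator(f0, f1, flow)
        # Synthesize the in-between frame at the current scale.
        mid = synthesizer(f0, f1, flow)
    return mid

The num_scales argument mirrors the flexible recurrent scales noted in the abstract: fewer scales cut computation, while more scales favor interpolation quality.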
Pages: 8076-8090
Number of pages: 15