ASVFI: AUDIO-DRIVEN SPEAKER VIDEO FRAME INTERPOLATION

Cited by: 0

Authors:

Wang, Qianrui [1]

Li, Dengshi [1]

Liao, Liang [2]

Song, Hao [1]

Li, Wei [1]

Xiao, Jing [3]

Affiliations:

[1] Jianghan Univ, Sch Artificial Intelligence, Wuhan, Peoples R China

[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore

[3] Wuhan Univ, Natl Engn Res Ctr Multimedia Software, Wuhan, Peoples R China

Source:

2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP | 2023

Keywords:

Speaker video; video frame interpolation; audio;

DOI:

10.1109/ICIP49359.2023.10222345

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Because of limited transmission bandwidth, video frame rates in online conferences are often low, which severely degrades the user experience. Video frame interpolation addresses this by synthesizing intermediate frames to raise the frame rate. Most existing video frame interpolation methods assume linear motion. Mouth motion, however, is nonlinear, so these methods cannot generate high-quality intermediate frames for speaker video. Exploiting the strong correlation between mouth shape and vocalization, we propose a new method named Audio-driven Speaker Video Frame Interpolation (ASVFI). First, Audio Net (ANet) extracts the audio feature. Second, the Video Net (VNet) encoder extracts the video feature. Finally, AVFusion fuses the audio and video features, and the VNet decoder decodes the intermediate frame. Experimental results show that when interpolating one frame, the PSNR is nearly 0.13 dB higher than the baseline; when interpolating seven frames, the PSNR is 0.33 dB higher than the baseline.
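To put the reported PSNR gains in context, the following is a minimal sketch of the standard PSNR definition and of the mean-squared-error ratio that a given dB gain implies. It is not the authors' evaluation code: the 8-bit peak value of 255, the flat-list frame representation, and the function names `psnr` and `mse_ratio_for_gain` are assumptions made for illustration.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two frames given as equal-length flat sequences of pixel values.

    Assumes 8-bit pixels (peak=255) unless another peak is passed.
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Ratio new_mse / old_mse implied by a PSNR improvement of gain_db decibels."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    for gain in (0.13, 0.33):
        reduction = 1.0 - mse_ratio_for_gain(gain)
        print(f"{gain} dB gain -> about {reduction:.1%} lower MSE")
```

Under this definition, a gain of 0.13 dB corresponds to roughly 3% lower mean squared error, and 0.33 dB to roughly 7% lower, assuming both methods are scored against the same ground-truth frames.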

Pages: 3200-3204

Page count: 5

50 items in total

[41] Parallel Spatio-Temporal Attention Transformer for Video Frame Interpolation
Ning, Xin
Cai, Feifan
Li, Yuhang
Ding, Youdong
ELECTRONICS, 2024, 13 (10)
[42] FLAVR: flow-free architecture for fast video frame interpolation
Tarun Kalluri
Deepak Pathak
Manmohan Chandraker
Du Tran
Machine Vision and Applications, 2023, 34
[43] VIDEO FRAME INTERPOLATION VIA EXCEPTIONAL MOTION-AWARE SYNTHESIS
Park, Minho
Lee, Sangmin
Ro, Yong Man
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 1958 - 1962
[44] An Analytical Study of CNN-based Video Frame Interpolation Techniques
Pandya, Kshitija
Varshney, Disha
Aggarwal, Ashray
Parihar, Anil Singh
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS 2020), 2020, : 1124 - 1131
[45] Progressive Spatial-temporal Collaborative Network for Video Frame Interpolation
Hu, Mengshun
Jiang, Kui
Liao, Liang
Nie, Zhixiang
Xiao, Jing
Wang, Zheng
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 2145 - 2153
[46] EFFICIENT CONVOLUTION AND TRANSFORMER-BASED NETWORK FOR VIDEO FRAME INTERPOLATION
Khalifeh, Issa
Murn, Luka
Mrak, Marta
Izquierdo, Ebroul
2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 1050 - 1054
[47] Progressive Motion Context Refine Network for Efficient Video Frame Interpolation
Kong, Lingtong
Liu, Jinfeng
Yang, Jie
IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 2338 - 2342
[48] Multi-scale Intermediate Flow Estimation for Video Frame Interpolation
Fan, Zehua
Zhu, Feng
Li, Lei
Tan, Xiaoyang
2022 IEEE 34TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2022, : 893 - 900
[49] Arbitrary Timestep Video Frame Interpolation with Time-Dependent Decoding
Zhang, Haokai
Ren, Dongwei
Yan, Zifei
Zuo, Wangmeng
MATHEMATICS, 2024, 12 (02)
[50] Flow-aware synthesis: A generic motion model for video frame interpolation
Jinbo Xing
Wenbo Hu
Yuechen Zhang
Tien-Tsin Wong
Computational Visual Media, 2021, 7 : 393 - 405
