ASVFI: AUDIO-DRIVEN SPEAKER VIDEO FRAME INTERPOLATION

Authors
Wang, Qianrui [1]
Li, Dengshi [1]
Liao, Liang [2]
Song, Hao [1]
Li, Wei [1]
Xiao, Jing [3]
Affiliations
[1] Jianghan Univ, Sch Artificial Intelligence, Wuhan, Peoples R China
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[3] Wuhan Univ, Natl Engn Res Ctr Multimedia Software, Wuhan, Peoples R China
Source
2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP | 2023
Keywords
Speaker video; video frame interpolation; audio;
DOI
10.1109/ICIP49359.2023.10222345
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Because data transmission is limited, video frame rates during online conferences are often low, which severely degrades the user experience. Video frame interpolation addresses this problem by synthesizing intermediate frames to raise the frame rate. Most existing video frame interpolation methods rely on a linear motion assumption. Mouth motion, however, is nonlinear, so these methods cannot generate high-quality intermediate frames for speaker video. Exploiting the strong correlation between mouth shape and vocalization, we propose a new method named Audio-driven Speaker Video Frame Interpolation (ASVFI). First, we extract the audio feature with the Audio Net (ANet). Second, we use the Video Net (VNet) encoder to extract the video feature. Finally, we fuse the audio and video features with AVFusion and decode the intermediate frame in the VNet decoder. Experimental results show that when interpolating one frame, the PSNR is nearly 0.13 dB higher than the baseline; when interpolating seven frames, the PSNR is 0.33 dB higher than the baseline.
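The abstract reports its gains in PSNR and contrasts the method with interpolation built on a linear motion assumption. The sketch below is only an illustration of those two ideas, not the authors' code: it computes PSNR with the standard definition, assuming 8-bit pixels (peak 255), and uses a per-pixel linear blend as a stand-in for a linear-motion baseline. The function names `psnr`, `linear_blend`, and `timestamps`, and the toy four-pixel frames, are hypothetical; the paper's actual baseline and evaluation protocol are not given in this record.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def linear_blend(frame0, frame1, t=0.5):
    """Naive interpolation: per-pixel linear blend at normalized time t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(frame0, frame1)]


def timestamps(n):
    """Normalized times of n evenly spaced intermediate frames between two inputs."""
    return [k / (n + 1) for k in range(1, n + 1)]


# Toy frames (hypothetical values): two pixels change between the input frames.
frame0 = [0.0, 0.0, 100.0, 100.0]
frame1 = [0.0, 100.0, 100.0, 200.0]
# Assumed ground-truth middle frame in which the change is NOT halfway (nonlinear motion).
truth_mid = [0.0, 80.0, 100.0, 180.0]

blend = linear_blend(frame0, frame1, t=0.5)
print("linear blend:", blend)
print("PSNR vs. truth (dB):", psnr(truth_mid, blend))
print("times for 1 frame:", timestamps(1))
print("times for 7 frames:", timestamps(7))
```

Interpolating one frame between each input pair doubles the frame rate, and interpolating seven multiplies it by eight, which is why `timestamps(7)` yields seven evenly spaced times. When the true motion is not halfway at t = 0.5, the linear blend misses the ground truth and its PSNR drops; that gap is the kind of error an audio cue about mouth shape is meant to reduce.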
Pages: 3200-3204
Page count: 5