ASVFI: AUDIO-DRIVEN SPEAKER VIDEO FRAME INTERPOLATION

Cited by: 0
Authors
Wang, Qianrui [1]
Li, Dengshi [1]
Liao, Liang [2]
Song, Hao [1]
Li, Wei [1]
Xiao, Jing [3]
Affiliations
[1] Jianghan Univ, Sch Artificial Intelligence, Wuhan, Peoples R China
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[3] Wuhan Univ, Natl Engn Res Ctr Multimedia Software, Wuhan, Peoples R China
Source
2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP | 2023
Keywords
Speaker video; video frame interpolation; audio
DOI
10.1109/ICIP49359.2023.10222345
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Because data transmission is limited, the video frame rate is low during online conferences, which severely degrades the user experience. Video frame interpolation can address the problem by synthesizing intermediate frames to raise the frame rate. Most existing video frame interpolation methods rely on a linear motion assumption. Mouth motion, however, is nonlinear, so these methods cannot generate high-quality intermediate frames for speaker video. Considering the strong correlation between mouth shape and vocalization, a new method is proposed, named Audio-driven Speaker Video Frame Interpolation (ASVFI). First, we extract the audio feature with Audio Net (ANet). Second, we use the Video Net (VNet) encoder to extract the video feature. Finally, we fuse the audio and video features with AVFusion and decode the intermediate frame in the VNet decoder. Experimental results show that when interpolating one frame, the PSNR is nearly 0.13 dB higher than the baseline; when interpolating seven frames, it is 0.33 dB higher.
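The abstract reports its gains in PSNR (dB). As a reading aid, the standard PSNR definition can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the abstract does not state the peak value, color space, or averaging protocol, so the 8-bit peak of 255 and the helper names `psnr` and `mse_ratio_for_gain` are assumptions made here.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equal-length sequences of pixel values.

    Assumes 8-bit intensities (peak=255); this is an assumption, since the
    abstract does not specify the evaluation protocol.
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixel values.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """MSE ratio (method / baseline) implied by a PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    truth = [52, 60, 61, 200]   # hypothetical ground-truth pixel values
    guess = [50, 63, 61, 198]   # hypothetical interpolated pixel values
    print(round(psnr(truth, guess), 2))
    print(round(mse_ratio_for_gain(0.13), 4))
    print(round(mse_ratio_for_gain(0.33), 4))
```

Under this definition, a 0.13 dB gain corresponds to roughly 3% lower mean squared error than the baseline, and 0.33 dB to roughly 7% lower. That conversion is exact only when PSNR is computed from a pooled MSE; if PSNR is averaged per frame, it is an approximation.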
Pages: 3200-3204
Page count: 5
Related papers
50 in total
  • [41] Parallel Spatio-Temporal Attention Transformer for Video Frame Interpolation
    Ning, Xin
    Cai, Feifan
    Li, Yuhang
    Ding, Youdong
    ELECTRONICS, 2024, 13 (10)
  • [42] FLAVR: flow-free architecture for fast video frame interpolation
    Tarun Kalluri
    Deepak Pathak
    Manmohan Chandraker
    Du Tran
    Machine Vision and Applications, 2023, 34
  • [43] VIDEO FRAME INTERPOLATION VIA EXCEPTIONAL MOTION-AWARE SYNTHESIS
    Park, Minho
    Lee, Sangmin
    Ro, Yong Man
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 1958 - 1962
  • [44] An Analytical Study of CNN-based Video Frame Interpolation Techniques
    Pandya, Kshitija
    Varshney, Disha
    Aggarwal, Ashray
    Parihar, Anil Singh
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS 2020), 2020, : 1124 - 1131
  • [45] Progressive Spatial-temporal Collaborative Network for Video Frame Interpolation
    Hu, Mengshun
    Jiang, Kui
    Liao, Liang
    Nie, Zhixiang
    Xiao, Jing
    Wang, Zheng
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 2145 - 2153
  • [46] EFFICIENT CONVOLUTION AND TRANSFORMER-BASED NETWORK FOR VIDEO FRAME INTERPOLATION
    Khalifeh, Issa
    Murn, Luka
    Mrak, Marta
    Izquierdo, Ebroul
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 1050 - 1054
  • [47] Progressive Motion Context Refine Network for Efficient Video Frame Interpolation
    Kong, Lingtong
    Liu, Jinfeng
    Yang, Jie
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 2338 - 2342
  • [48] Multi-scale Intermediate Flow Estimation for Video Frame Interpolation
    Fan, Zehua
    Zhu, Feng
    Li, Lei
    Tan, Xiaoyang
    2022 IEEE 34TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2022, : 893 - 900
  • [49] Arbitrary Timestep Video Frame Interpolation with Time-Dependent Decoding
    Zhang, Haokai
    Ren, Dongwei
    Yan, Zifei
    Zuo, Wangmeng
    MATHEMATICS, 2024, 12 (02)
  • [50] Flow-aware synthesis: A generic motion model for video frame interpolation
    Jinbo Xing
    Wenbo Hu
    Yuechen Zhang
    Tien-Tsin Wong
    Computational Visual Media, 2021, 7 : 393 - 405