ASVFI: AUDIO-DRIVEN SPEAKER VIDEO FRAME INTERPOLATION

Cited by: 0
Authors
Wang, Qianrui [1]
Li, Dengshi [1]
Liao, Liang [2]
Song, Hao [1]
Li, Wei [1]
Xiao, Jing [3]
Affiliations
[1] Jianghan Univ, Sch Artificial Intelligence, Wuhan, Peoples R China
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[3] Wuhan Univ, Natl Engn Res Ctr Multimedia Software, Wuhan, Peoples R China
Source
2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP | 2023
Keywords
Speaker video; video frame interpolation; audio;
DOI
10.1109/ICIP49359.2023.10222345
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Due to limited data transmission, video frame rates are low during online conferences, which severely affects the user experience. Video frame interpolation can address this problem by synthesizing intermediate frames to increase the frame rate. Most existing video frame interpolation methods are based on a linear motion assumption. However, mouth motion is nonlinear, so these methods cannot generate high-quality intermediate frames for speaker video. Considering the strong correlation between mouth shape and vocalization, a new method is proposed, named Audio-driven Speaker Video Frame Interpolation (ASVFI). First, we extract the audio feature with Audio Net (ANet). Second, we use the Video Net (VNet) encoder to extract the video feature. Finally, we fuse the audio and video features with AVFusion and decode the intermediate frame in the VNet decoder. Experimental results show that, when interpolating one frame, the PSNR is nearly 0.13 dB higher than the baseline; when interpolating seven frames, the PSNR is 0.33 dB higher than the baseline.
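The abstract reports its results as PSNR gains in dB. As a reading aid, the following minimal Python sketch shows the standard PSNR definition and how a dB gain maps to a reduction in mean squared error. It is not the authors' evaluation code: the function names `psnr` and `mse_ratio_for_gain`, the flat pixel-sequence input, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences.

    Assumption: pixels are given as flat sequences of numbers and the peak
    value is 255 (8-bit images); the paper's exact protocol is not stated here.
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """MSE ratio (improved / baseline) implied by a PSNR gain at a fixed peak value."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    # Hypothetical 4-pixel frames, purely for illustration.
    ground_truth = [10, 20, 30, 40]
    interpolated = [12, 18, 33, 39]
    print(f"PSNR: {psnr(ground_truth, interpolated):.2f} dB")
    for gain in (0.13, 0.33):
        print(f"{gain} dB gain -> MSE ratio {mse_ratio_for_gain(gain):.4f}")
```

If each reported gain is read as a single MSE ratio at a fixed peak value, 0.13 dB corresponds to roughly 3% lower MSE and 0.33 dB to roughly 7% lower MSE. PSNR averaged over many frames does not map exactly to one MSE ratio, so these figures are only an approximate interpretation.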
Pages: 3200-3204
Number of pages: 5