Deformable Attention Network for Efficient Space-Time Video Super-Resolution

Citations: 0
Authors
Wang, Hua [1, 2]
Chamchong, Rapeeporn [1 ]
Chomphuwiset, Phatthanaphong [3 ]
Pawara, Pornntiwa [1 ]
Affiliations
[1] Mahasarakham Univ, Fac Informat, Dept Comp Sci, Maha Sarakham, Thailand
[2] Putian Univ, New Engn Ind Coll, Putian, Peoples R China
[3] MQ Sq, Bangkok, Thailand
Keywords
image enhancement; image processing; image resolution
DOI
10.1049/ipr2.70026
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Space-time video super-resolution (STVSR) aims to construct video sequences with high space-time resolution from low-frame-rate, low-resolution video sequences. While recent STVSR works combine temporal interpolation and spatial super-resolution in a unified framework, they remain computationally expensive in both the temporal and spatial dimensions, and they struggle to interpolate intermediate frames accurately and to use temporal information efficiently. To address these challenges, we propose a deformable attention network for efficient STVSR. Specifically, we introduce a deformable interpolation block that employs hierarchical feature fusion to handle complex inter-frame motions at multiple scales, enabling more accurate intermediate frame generation. To fully utilise temporal information, we design a temporal feature shuffle block (TFSB) that efficiently exchanges complementary information across multiple frames. Additionally, we develop a motion feature enhancement block incorporating a channel attention mechanism to selectively emphasise motion-related features, further boosting the TFSB's effectiveness. Experimental results on benchmark datasets demonstrate that the proposed method achieves competitive performance in STVSR tasks.
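The abstract names a channel attention mechanism but does not specify its form. As a rough illustration only, the sketch below shows a generic squeeze-and-excitation style channel attention in NumPy. It is not the authors' motion feature enhancement block: the function name `channel_attention`, the weight matrices `w_reduce` and `w_expand`, the reduction ratio `r`, and the feature-map shape are all assumptions made for the example.

```python
import numpy as np

def channel_attention(features, w_reduce, w_expand):
    """Reweight the channels of a (C, H, W) feature map.

    Generic squeeze-and-excitation style gating, used here as a stand-in;
    the paper's actual block may differ.
    """
    # Squeeze: global average pool each channel to one descriptor.
    squeezed = features.mean(axis=(1, 2))               # shape (C,)
    # Excite: bottleneck layer with ReLU, then a sigmoid gate in (0, 1).
    hidden = np.maximum(w_reduce @ squeezed, 0.0)       # shape (C // r,)
    gate = 1.0 / (1.0 + np.exp(-(w_expand @ hidden)))   # shape (C,)
    # Scale: multiply every channel by its gate value via broadcasting.
    return features * gate[:, None, None], gate

# Hypothetical sizes and random weights, chosen only to exercise the function.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
feats = rng.standard_normal((C, H, W))
w_reduce = rng.standard_normal((C // r, C)) * 0.1
w_expand = rng.standard_normal((C, C // r)) * 0.1
out, gate = channel_attention(feats, w_reduce, w_expand)
```

Channels whose gate value is close to 1 pass through almost unchanged, while channels with a small gate are suppressed. In a trained network the weights would be learned so that the gate favours the features the task needs, which is presumably motion-related features in the setting the abstract describes.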
Pages: 13