Deformable Attention Network for Efficient Space-Time Video Super-Resolution

Cited by: 0
Authors
Wang, Hua [1, 2]
Chamchong, Rapeeporn [1 ]
Chomphuwiset, Phatthanaphong [3 ]
Pawara, Pornntiwa [1 ]
Affiliations
[1] Mahasarakham Univ, Fac Informat, Dept Comp Sci, Maha Sarakham, Thailand
[2] Putian Univ, New Engn Ind Coll, Putian, Peoples R China
[3] MQ Sq, Bangkok, Thailand
Keywords
image enhancement; image processing; image resolution
DOI
10.1049/ipr2.70026
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Space-time video super-resolution (STVSR) aims to construct video sequences with high spatial and temporal resolution from low-frame-rate, low-resolution video sequences. While recent STVSR works combine temporal interpolation and spatial super-resolution in a unified framework, they remain computationally expensive across both the temporal and spatial dimensions, and they struggle in particular with accurate intermediate frame interpolation and efficient use of temporal information. To address these challenges, we propose a deformable attention network for efficient STVSR. Specifically, we introduce a deformable interpolation block that employs hierarchical feature fusion to handle complex inter-frame motions at multiple scales, enabling more accurate intermediate frame generation. To fully utilise temporal information, we design a temporal feature shuffle block (TFSB) that efficiently exchanges complementary information across multiple frames. Additionally, we develop a motion feature enhancement block incorporating a channel attention mechanism to selectively emphasise motion-related features, further boosting the TFSB's effectiveness. Experimental results on benchmark datasets demonstrate that the proposed method achieves competitive performance in STVSR tasks.
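To illustrate the general idea behind the channel attention mentioned in the abstract, the following is a minimal sketch of a generic squeeze-and-excitation style gate in plain Python. It is not the paper's motion feature enhancement block: the function name `channel_attention`, the weight matrices `w1` and `w2`, and the two-layer bottleneck layout are all assumptions made for illustration only.

```python
import math


def channel_attention(features, w1, w2):
    """Generic SE-style channel attention (illustrative, not the paper's block).

    features: list of C channels, each an H x W nested list of floats.
    w1: hypothetical weights of shape (hidden, C).
    w2: hypothetical weights of shape (C, hidden).
    Returns (rescaled_features, gates).
    """
    # Squeeze: global average pooling reduces each channel to one scalar.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features
    ]
    # Excite, step 1: linear projection to the bottleneck followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excite, step 2: linear projection back to C channels, then a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Rescale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch] for g, ch in zip(gates, features)]
    return scaled, gates


# Two 2x2 channels; the second gate is driven higher by the assumed weights.
features = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[1.0, 1.0]]
w2 = [[0.0], [1.0]]
scaled, gates = channel_attention(features, w1, w2)
```

In a trained network the weights would be learned so that channels carrying motion-related responses receive gates closer to 1, while less informative channels are suppressed.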
Pages: 13