Fast Online Video Super-Resolution with Deformable Attention Pyramid

Cited by: 17
Authors
Fuoli, Dario [1]
Danelljan, Martin [1]
Timofte, Radu [1, 2]
Van Gool, Luc [1, 3]
Affiliations
[1] Swiss Fed Inst Technol, Comp Vis Lab, Zurich, Switzerland
[2] Univ Wurzburg, CAIDAS, Wurzburg, Germany
[3] Katholieke Univ Leuven, Leuven, Belgium
Source
2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV) | 2023
Keywords
DOI
10.1109/WACV56688.2023.00178
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video super-resolution (VSR) has many applications that pose strict causal, real-time, and latency constraints, including video streaming and TV. We address the VSR problem under these settings, which poses additional important challenges since information from future frames is unavailable. Importantly, designing efficient yet effective frame alignment and fusion modules remains a central problem. In this work, we propose a recurrent VSR architecture based on a deformable attention pyramid (DAP). Our DAP aligns and integrates information from the recurrent state into the current frame prediction. To circumvent the computational cost of traditional attention-based methods, we only attend to a limited number of spatial locations, which are dynamically predicted by the DAP. Comprehensive experiments and analysis of the proposed key innovations show the effectiveness of our approach. We significantly reduce processing time and computational complexity in comparison to state-of-the-art methods, while maintaining high performance. We surpass the state-of-the-art method EDVR-M on two standard benchmarks with a speed-up of over 3x.
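The sparse attention idea in the abstract (attending only to a few dynamically chosen locations in the recurrent state) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the bilinear sampling, and the scaled dot-product scoring are assumptions made for illustration, and the offsets are passed in as an argument here, whereas the abstract states that the DAP predicts them.

```python
import numpy as np

def bilinear_sample(feat, ys, xs):
    """Sample feat (H, W, C) at fractional coordinates ys, xs of equal shape."""
    H, W, _ = feat.shape
    ys = np.clip(ys, 0, H - 1)          # keep sampling points inside the map
    xs = np.clip(xs, 0, W - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, H - 1)
    x1 = np.minimum(x0 + 1, W - 1)
    wy = (ys - y0)[..., None]           # vertical interpolation weight
    wx = (xs - x0)[..., None]           # horizontal interpolation weight
    top = (1 - wx) * feat[y0, x0] + wx * feat[y0, x1]
    bot = (1 - wx) * feat[y1, x0] + wx * feat[y1, x1]
    return (1 - wy) * top + wy * bot

def sparse_deformable_attention(query, state, offsets):
    """Hypothetical sketch: attend to K offset locations per pixel.

    query:   (H, W, C) features of the current frame
    state:   (H, W, C) recurrent state features
    offsets: (H, W, K, 2) per-pixel (dy, dx) offsets; assumed given here
    """
    H, W, C = query.shape
    gy, gx = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    ys = gy[..., None] + offsets[..., 0]            # (H, W, K)
    xs = gx[..., None] + offsets[..., 1]            # (H, W, K)
    sampled = bilinear_sample(state, ys, xs)        # (H, W, K, C)
    # Assumed scoring: scaled dot product between query and each sample.
    scores = np.einsum("hwc,hwkc->hwk", query, sampled) / np.sqrt(C)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over K
    return np.einsum("hwk,hwkc->hwc", weights, sampled)
```

With K sampled locations per pixel, the cost of this sketch grows with H·W·K rather than with (H·W)², which is the kind of saving the abstract attributes to attending to a limited number of locations.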
Pages: 1735-1744
Number of pages: 10