Structured Sparsity Learning for Efficient Video Super-Resolution

Cited by: 18
Authors
Xia, Bin [1]
He, Jingwen [2]
Zhang, Yulun [3]
Wang, Yitong [4]
Tian, Yapeng [5]
Yang, Wenming [1]
Van Gool, Luc [3]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Shanghai AI Lab, Shanghai, Peoples R China
[3] Swiss Fed Inst Technol, Zurich, Switzerland
[4] ByteDance Inc, Beijing, Peoples R China
[5] Univ Texas Dallas, Dallas, TX USA
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023
Keywords
DOI
10.1109/CVPR52729.2023.02168
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The high computational costs of video super-resolution (VSR) models hinder their deployment on resource-limited devices, e.g., smartphones and drones. Existing VSR models contain considerable redundant filters, which drag down the inference efficiency. To prune these unimportant filters, we develop a structured pruning scheme called Structured Sparsity Learning (SSL) according to the properties of VSR. In SSL, we design pruning schemes for several key components in VSR models, including residual blocks, recurrent networks, and upsampling networks. Specifically, we develop a Residual Sparsity Connection (RSC) scheme for residual blocks of recurrent networks to liberate pruning restrictions and preserve the restoration information. For upsampling networks, we design a pixel-shuffle pruning scheme to guarantee the accuracy of feature channel-space conversion. In addition, we observe that pruning error would be amplified as the hidden states propagate along with recurrent networks. To alleviate the issue, we design Temporal Finetuning (TF). Extensive experiments show that SSL can significantly outperform recent methods quantitatively and qualitatively. The code is available at https://github.com/Zj-BinXia/SSL.
Pages: 22638-22647
Page count: 10