Video Compressed Sensing Using a Convolutional Neural Network

Cited by: 41
Authors
Shi, Wuzhen [1, 2]
Liu, Shaohui [1, 2]
Jiang, Feng [1, 2]
Zhao, Debin [1, 2]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518055, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Image reconstruction; Correlation; Compressed sensing; Convolutional neural networks; Computer architecture; Video sequences; Machine learning; video compressed sensing; video reconstruction; multilevel feature compensation; convolutional neural network;
DOI
10.1109/TCSVT.2020.2978703
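Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic and communication technology];
Subject classification code
0808; 0809;
Abstract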
Several deep learning-based image compressed sensing (CS) methods have recently been developed that achieve remarkable reconstruction quality at low computational complexity. However, these methods exploit only intraframe correlation and ignore interframe cues, which makes them inefficient when applied directly to video CS. In this paper, we propose a novel video CS framework based on a convolutional neural network (dubbed VCSNet) that exploits both intraframe and interframe correlations. Specifically, VCSNet divides the video sequence into multiple groups of pictures (GOPs); the first frame of each GOP is a keyframe, which is sampled at a higher sampling ratio than the remaining nonkeyframes. Within a GOP, each frame is sampled block by block with a convolution layer, so that the sampling matrix is optimized automatically. For reconstruction, a linear convolutional neural network first produces a framewise initial reconstruction, which effectively exploits the intraframe correlation. A deep reconstruction stage with multilevel feature compensation then compensates the nonkeyframes with the keyframe at multiple feature levels, which allows the network to better exploit both intraframe and interframe correlations. Extensive experiments on six benchmark videos show that VCSNet outperforms state-of-the-art video CS methods and deep learning-based image CS methods in both objective and subjective reconstruction quality.
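The sampling scheme described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the block size of 8, the GOP size of 4, the sampling ratios, and the random Gaussian sampling matrices are all assumptions made for illustration, and the pseudo-inverse stands in for the learned linear initial reconstruction. The sketch only shows that applying one matrix to every non-overlapping block is equivalent to a strided convolution, and that keyframes receive more measurements than nonkeyframes; the deep reconstruction with multilevel feature compensation is not covered.

```python
import numpy as np

def split_into_gops(num_frames, gop_size):
    """Group frame indices into GOPs; the first index of each GOP is the keyframe."""
    return [list(range(s, min(s + gop_size, num_frames)))
            for s in range(0, num_frames, gop_size)]

def num_measurements(block_size, ratio):
    """Measurements per block for a given sampling ratio (at least one)."""
    return max(1, int(round(ratio * block_size * block_size)))

def sample_frame(frame, phi, block_size):
    """Apply phi (m x B*B) to every non-overlapping BxB block of the frame.

    This equals a convolution with m filters of size BxB, stride B, and no bias.
    """
    h, w = frame.shape
    assert h % block_size == 0 and w % block_size == 0
    bh, bw = h // block_size, w // block_size
    blocks = frame.reshape(bh, block_size, bw, block_size).transpose(0, 2, 1, 3)
    blocks = blocks.reshape(bh, bw, block_size * block_size)
    return blocks @ phi.T  # shape (bh, bw, m)

def initial_reconstruction(meas, psi, block_size):
    """Linear reconstruction: psi (B*B x m) maps each measurement vector to a block."""
    bh, bw, _ = meas.shape
    blocks = (meas @ psi.T).reshape(bh, bw, block_size, block_size)
    return blocks.transpose(0, 2, 1, 3).reshape(bh * block_size, bw * block_size)

# Hypothetical settings; in VCSNet the sampling matrices are learned, not random.
rng = np.random.default_rng(0)
B = 8
video = rng.random((10, 32, 32))          # (frames, height, width)
key_ratio, nonkey_ratio = 0.5, 0.1        # keyframes get the higher ratio
phi_key = rng.standard_normal((num_measurements(B, key_ratio), B * B))
phi_non = rng.standard_normal((num_measurements(B, nonkey_ratio), B * B))
psi_key, psi_non = np.linalg.pinv(phi_key), np.linalg.pinv(phi_non)

initial = np.empty_like(video)
for gop in split_into_gops(len(video), gop_size=4):
    for k, t in enumerate(gop):
        phi, psi = (phi_key, psi_key) if k == 0 else (phi_non, psi_non)
        y = sample_frame(video[t], phi, B)
        initial[t] = initial_reconstruction(y, psi, B)
```

With these placeholder matrices the initial reconstruction is only a least-squares estimate per block; a trained network would replace both `phi` and `psi` with learned convolution weights.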
Pages: 425-438
Number of pages: 14