Memory-Augmented Auto-Regressive Network for Frame Recurrent Inter Prediction

Cited by: 0

Authors:

Hu, Yuzhang [1]

Xia, Sifeng [1]

Yang, Wenhan [1]

Liu, Jiaying [1]

Affiliations:

[1] Peking Univ, Wangxuan Inst Comp Technol, Beijing, Peoples R China

Source:

2020 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) | 2020

Funding:

National Natural Science Foundation of China; Beijing Natural Science Foundation;

Keywords:

High Efficiency Video Coding (HEVC); inter prediction; deep learning; Memory-Augmented Auto-Regressive Network;

DOI:

Not available

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

Inter prediction is essential for modern codecs to remove temporal redundancy. In this paper, we generate artificial reference frames from previously reconstructed frames for inter prediction, offering a better choice when traditional block-wise motion estimation fails to find a good reference block. Long-term temporal dynamics are tracked throughout the coding process to generate more accurate and realistic artificial reference frames. Specifically, we propose a Memory-Augmented Auto-Regressive Network (MAAR-Net) for frame prediction in video coding. MAAR-Net regresses the current frame from its two nearest frames via an auto-regressive (AR) model to better capture the main spatial and temporal structures. The AR regression coefficients are generated from adjacent-frame information as well as the long-term motion dynamics accumulated and propagated by a convolutional Long Short-Term Memory (LSTM) network. To generate a higher-quality target frame, a quality attention mechanism is introduced for temporal regularization between different reconstructed frames. With this network design, our method achieves an average BD-rate saving of 4.0%, and up to 10.6%, over HEVC for the luma component under the low-delay configuration.
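The abstract describes the AR step only at a high level, so the following is a minimal sketch of what regressing a frame from its two nearest frames could look like, not the authors' implementation. The window radius, the coefficient layout, the border clamping, and the names `ar_predict` and `uniform_coeffs` are all assumptions made for illustration. In MAAR-Net the per-pixel coefficients would come from the network (adjacent-frame features plus the convolutional LSTM state); here the caller supplies them.

```python
from typing import List

Frame = List[List[float]]  # 2-D luma samples, row-major


def ar_predict(prev2: Frame, prev1: Frame,
               coeffs: List[List[List[float]]], radius: int = 1) -> Frame:
    """Predict a frame as a per-pixel weighted sum over a (2r+1)x(2r+1)
    window in each of the two nearest frames.

    coeffs[y][x] holds 2*(2r+1)^2 weights for pixel (y, x): first the
    window in prev2, then the window in prev1 (hypothetical layout).
    """
    height, width = len(prev1), len(prev1[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            weights = iter(coeffs[y][x])
            acc = 0.0
            for frame in (prev2, prev1):
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        # Clamp coordinates so border pixels reuse edge samples.
                        yy = min(max(y + dy, 0), height - 1)
                        xx = min(max(x + dx, 0), width - 1)
                        acc += next(weights) * frame[yy][xx]
            out[y][x] = acc
    return out


def uniform_coeffs(height: int, width: int, radius: int = 1):
    """Placeholder weights: every tap in both windows gets equal weight.
    A stand-in for network-generated coefficients (assumption)."""
    n = 2 * (2 * radius + 1) ** 2
    return [[[1.0 / n] * n for _ in range(width)] for _ in range(height)]


if __name__ == "__main__":
    older = [[10.0] * 4 for _ in range(3)]
    newer = [[20.0] * 4 for _ in range(3)]
    predicted = ar_predict(older, newer, uniform_coeffs(3, 4))
    print(predicted[0])
```

With uniform weights the prediction is simply the mean of both windows, so two flat frames yield a flat frame halfway between them. Spatially varying weights are what would let such a model follow motion and local structure.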


Pages: 5
