Merging Multiple Template Matching Predictions in Intra Coding with Attentive Convolutional Neural Network

Cited by: 1

Authors:

Wang, Qijun [1,2,3]

Zheng, Guodong [1,3]

Affiliations:

[1] Anhui Univ, Minist Educ, Key Lab Intelligent Comp & Signal Proc, Hefei, Peoples R China

[2] Anhui Univ, Informat Mat & Intelligent Sensing Lab Anhui Prov, Hefei, Peoples R China

[3] Anhui Univ, Sch Comp Sci & Technol, Hefei, Peoples R China

Source:

PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021

Keywords:

Intra coding; video compression; template matching; convolutional neural network; attention mechanism;

DOI:

10.1145/3474085.3475359

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In intra coding, template matching prediction is an effective method for reducing the non-local redundancy inside image content. However, the prediction indicated by the best template match is not always actually the best prediction. To solve this problem, we propose a method that merges multiple template matching predictions through a convolutional neural network with an attention module. The convolutional neural network aims at exploring different combinations of the candidate template matching predictions, and the attention module focuses on determining the most significant prediction candidate. In addition, the spatial module in the attention mechanism can be used to model the relationship between the original pixels in the current block and the reconstructed pixels in adjacent regions (the template). Compared with directional intra prediction and traditional template matching prediction, our method provides a unified framework for generating predictions with high accuracy. The experimental results show that, compared with the averaging strategy, the BD-rate reductions reach up to 4.7%, 5.5% and 18.3% on the classic standard sequences (classB-classF), the SIQAD dataset (screen content), and the Urban100 dataset (natural scenes), respectively, while the average bit-rate savings are 0.5%, 2.7% and 1.8%, respectively.
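To make the baseline named in the abstract concrete, the following is a minimal Python sketch of template matching prediction with multiple candidates, merged by plain averaging. It is an illustration only and not the authors' implementation: the function names, the block size `bs`, the template width `tw`, the search range, and the SSD matching cost are all assumptions. The `merge_weighted` function is a hand-written softmax weighting over matching costs, included only as a simple stand-in for the idea of emphasizing the most significant candidate; it is not the attentive convolutional neural network proposed in the paper.

```python
import numpy as np

def template_of(img, y, x, bs, tw):
    """Flatten the L-shaped template: tw rows above and tw columns left of the block."""
    top = img[y - tw:y, x - tw:x + bs]    # rows above the block, including the corner
    left = img[y:y + bs, x - tw:x]        # columns to the left of the block
    return np.concatenate([top.ravel(), left.ravel()])

def top_k_candidates(recon, y, x, bs=4, tw=2, k=3):
    """Return the k blocks whose templates best match the current block's template.

    Assumption: the search is restricted to blocks lying entirely above row y,
    a simplified model of the already reconstructed (causal) area.
    """
    target = template_of(recon, y, x, bs, tw)
    scored = []
    for cy in range(tw, y - bs + 1):
        for cx in range(tw, recon.shape[1] - bs + 1):
            cand = template_of(recon, cy, cx, bs, tw)
            ssd = float(np.sum((cand - target) ** 2))  # sum of squared differences
            scored.append((ssd, cy, cx))
    scored.sort()
    best = scored[:k]
    blocks = np.stack([recon[cy:cy + bs, cx:cx + bs] for _, cy, cx in best])
    costs = np.array([s for s, _, _ in best])
    return blocks, costs

def merge_average(blocks):
    """Averaging strategy: every candidate prediction gets equal weight."""
    return blocks.mean(axis=0)

def merge_weighted(blocks, costs, temperature=1.0):
    """Hypothetical stand-in for learned merging: softmax weights from matching costs."""
    logits = -(costs - costs.min()) / temperature
    w = np.exp(logits)
    w /= w.sum()
    return np.tensordot(w, blocks, axes=1), w

# Toy example: a periodic image, so exact template matches exist above the block.
recon = np.tile(np.arange(16, dtype=float).reshape(4, 4), (4, 4))
blocks, costs = top_k_candidates(recon, y=12, x=8)
avg_pred = merge_average(blocks)
weighted_pred, weights = merge_weighted(blocks, costs)
print(avg_pred.shape, weights)
```

In this sketch the weights depend only on template costs. The abstract describes a network that instead learns how to combine the candidates and uses the template pixels through a spatial attention module, so a learned model would replace `merge_weighted` entirely.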

Citation

Pages: 1994-2001

Page count: 8

