Optimization of the Adaptive Computationally-Scalable Motion Estimation and Compensation for the Hardware H.264/AVC Encoder

Cited by: 10
Authors
Pastuszak, Grzegorz [1]
Jakubowski, Mariusz [2]
Affiliations
[1] Warsaw Univ Technol, Inst Radioelect, Nowowiejska 15-19, PL-00665 Warsaw, Poland
[2] Acad Business & Finance VISTULA, Fac Engn, Stoklosy 3, PL-02787 Warsaw, Poland
Source
JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY | 2016, Vol. 82, No. 3
Keywords
Video coding; Motion estimation; H.264/AVC; FPGA; Very large-scale integration (VLSI); Architecture design; Algorithm; Chip
DOI
10.1007/s11265-015-1021-5
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The adaptive computationally-scalable motion estimation algorithm and its hardware implementation allow the H.264/AVC encoder to achieve close-to-optimal compression efficiency in real-time conditions. In particular, the search algorithm achieves results close to the optimum even when the number of search points assigned to macroblocks is strongly limited and varies with time. The previously reported architecture implementing the algorithm takes at least 674 clock cycles to interpolate and load the reference area, and this number cannot be decreased without decreasing the search range. This paper proposes optimizations of the architecture that increase the maximal throughput of the motion estimation system by up to four times. Firstly, the chroma interpolation follows the search process, whereas the luma interpolation precedes it. Secondly, the luma interpolator computes 128 instead of 64 samples per clock cycle. Thirdly, the number of on-chip memories keeping the interpolated reference area is increased accordingly to 128. Fourthly, some modules previously working at the base frequency are redesigned to operate at the doubled clock frequency. Since the on-chip memories do not store fractional-pel chroma samples, their total size is reduced from 160.44 to 104.44 kB. Additional savings in the memory size are achieved by the sequential processing of two reference-picture areas for each macroblock. The architecture is verified in a real-time FPGA hardware encoder. Synthesis results show that the updated architecture can support 2160p@30fps encoding in 0.13 µm TSMC technology with a small increase in hardware resources and some loss in compression efficiency. The efficiency improves when processing lower resolutions.
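To put the 674-cycle figure and the 2160p@30fps target in perspective, the following rough sketch estimates the macroblock rate and the resulting cycle budget. It assumes 16x16 macroblocks and a 3840x2160 frame, and it further assumes (this is not stated in the abstract) that the 674 cycles are spent once per macroblock with no overlap between macroblocks. The function names are illustrative only and do not come from the paper.

```python
import math


def macroblocks_per_second(width, height, fps, mb_size=16):
    """Number of 16x16 macroblocks to process per second (frame padded up)."""
    mbs_x = math.ceil(width / mb_size)
    mbs_y = math.ceil(height / mb_size)
    return mbs_x * mbs_y * fps


def min_clock_hz(cycles_per_mb, width, height, fps):
    """Clock needed if every macroblock costs cycles_per_mb cycles (assumption:
    no overlap between consecutive macroblocks)."""
    return cycles_per_mb * macroblocks_per_second(width, height, fps)


# 3840x2160 at 30 fps: 240 * 135 macroblocks per frame.
rate_2160p = macroblocks_per_second(3840, 2160, 30)

# Clock implied by the 674-cycle lower bound of the earlier architecture,
# under the no-overlap assumption above.
clock_for_674 = min_clock_hz(674, 3840, 2160, 30)

print(f"2160p@30fps: {rate_2160p} macroblocks/s")
print(f"674 cycles per macroblock would need about {clock_for_674 / 1e6:.0f} MHz")
```

Under these assumptions the per-macroblock cycle count, rather than the search itself, becomes the limiting factor at 2160p, which is consistent with the abstract's focus on shortening the interpolation and loading stage.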
Pages: 391-402
Page count: 12