A Patch Memory System For Image Processing and Computer Vision

Citations: 0
Authors
Clemons, Jason [1]
Cheng, Chih-Chi [2]
Frosio, Iuri [1]
Johnson, Daniel [1]
Keckler, Stephen W. [1]
Affiliations
[1] NVIDIA, Santa Clara, CA 95050 USA
[2] Qualcomm, Santa Clara, CA USA
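Source
2016 49TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO) | 2016
Keywords
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract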
From self-driving cars to high dynamic range (HDR) imaging, the demand for image-based applications is growing quickly. In mobile systems, these applications place particular strain on performance and energy efficiency. As traditional memory systems are optimized for 1D memory access, they are unable to efficiently exploit the multi-dimensional locality characteristics of image-based applications, which often operate on sub-regions of 2D and 3D image data. We have developed a new Patch Memory System (PMEM) tailored to application domains that process 2D and 3D data streams. PMEM supports efficient multidimensional addressing, automatic handling of image boundaries, and efficient caching and prefetching of image data. In addition to an optimized cache, PMEM includes hardware for offloading structured address calculations from processing units. We improve average energy-delay by 26% compared to EVA, a memory system for computer vision applications. Compared to a traditional cache, our results show that PMEM can reduce processor energy by 34% for a selection of CV and IP applications, leading to system performance improvement of up to 32% and energy-delay product improvement of 48-86% on the applications in this study.
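As a rough illustration of the structured address calculation the abstract mentions, the sketch below computes, in software, the row-major byte addresses touched by one 2D patch access. It is not the PMEM design: the function name `patch_addresses`, its parameters, and the clamp-to-edge boundary policy are all assumptions made for this example, since the abstract does not say how image boundaries are handled.

```python
def patch_addresses(base, width, height, x0, y0, patch_w, patch_h, elem_size=1):
    """Return byte addresses for a patch whose top-left corner is (x0, y0).

    The image is assumed to be stored row-major starting at `base`.
    Coordinates outside the image are clamped to the nearest edge pixel
    (an assumed boundary policy, not one taken from the paper).
    """
    addresses = []
    for dy in range(patch_h):
        # Clamp the row index into [0, height - 1].
        y = min(max(y0 + dy, 0), height - 1)
        row = []
        for dx in range(patch_w):
            # Clamp the column index into [0, width - 1].
            x = min(max(x0 + dx, 0), width - 1)
            row.append(base + (y * width + x) * elem_size)
        addresses.append(row)
    return addresses


if __name__ == "__main__":
    # A 3x3 patch hanging off the top-left corner of a 4x3 image.
    for row in patch_addresses(0, 4, 3, -1, -1, 3, 3):
        print(row)
```

Every element here costs two clamps, a multiply, and an add. That per-pixel work is the kind of calculation the abstract describes offloading from the processing units.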
Pages: 13