Optimization model for memory bandwidth usage in X-ray image enhancement

Cited by: 0

Authors:

Albers, Rob [1,2]

Suijs, Eric [2]

de With, Peter H. N. [1,3]

Affiliations:

[1] Eindhoven Univ Technol, Eindhoven, Netherlands

[2] Philips Med Syst CardioVasc X Ray, Best, Netherlands

[3] LogicaCMG Nederland BV, TSE3, Eindhoven, Netherlands

Source:

REAL-TIME IMAGE PROCESSING 2008 | 2008 / Vol. 6811

Keywords:

bandwidth reduction; low latency; multi-core; memory optimization; X-ray; image enhancement

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

In cardiovascular minimally invasive interventions, physicians require low-latency X-ray imaging applications, as their actions must be directly visible on the screen. The image-processing system should enable the simultaneous execution of a plurality of functions. Because dedicated hardware lacks flexibility, there is a growing interest in using off-the-shelf computer technology. Because memory bandwidth is a scarce resource, we focus on optimization methods for bandwidth reduction within multiprocessor systems at the chip level. We create a practical, realistic model of the required compute and memory bandwidth for a given set of image-processing functions. Similar modeling is applied to the available system resources. We concentrate in particular on X-ray image processing based on multi-resolution decomposition, noise reduction and image-enhancement techniques. We derive formulas with which we can optimize the mapping of the application onto processors, cache and memory for different configurations. The data-block granularity is matched to the memory hierarchy, so that caching is optimized for low latency. More specifically, we exploit the locality of the signal-processing functions to streamline the memory communication. A substantial performance improvement is realized by a new memory-communication model that incorporates the data dependencies of the image-processing functions. Results show a memory-bandwidth reduction on the order of 60% and a latency reduction on the order of 30-60% compared to straightforward implementations.
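The idea of matching data-block granularity to the cache can be illustrated with a deliberately simplified traffic estimate. The sketch below is not the model from the paper, whose formulas are not reproduced in this record. It assumes a hypothetical pipeline in which every stage reads one frame and writes one frame, ignores halo regions between neighbouring blocks, and treats a block as cacheable when its input and output buffers together fit in the cache. All function names, frame sizes and cache sizes are illustrative assumptions.

```python
def frame_bytes(width, height, bytes_per_pixel):
    """Size of one image frame in bytes."""
    return width * height * bytes_per_pixel


def traffic_stage_by_stage(n_stages, frame):
    """Assumed straightforward mapping: each stage reads a full frame
    from external memory and writes a full frame back."""
    return n_stages * 2 * frame


def traffic_block_fused(n_stages, frame, block_bytes, cache_bytes):
    """Assumed block-based mapping: all stages run on one block while it
    stays in cache, so only the first read and last write reach memory.
    Falls back to the stage-by-stage estimate if the block working set
    (input buffer + output buffer) does not fit in the cache."""
    if 2 * block_bytes > cache_bytes:
        return traffic_stage_by_stage(n_stages, frame)
    return 2 * frame


# Illustrative numbers only (not taken from the paper).
frame = frame_bytes(1024, 1024, 2)          # 1024 x 1024 pixels, 16 bit
naive = traffic_stage_by_stage(3, frame)    # three pipeline stages
fused = traffic_block_fused(3, frame,
                            block_bytes=64 * 1024,
                            cache_bytes=1024 * 1024)
reduction = 1 - fused / naive

print(f"stage-by-stage: {naive} bytes per frame")
print(f"block-fused:    {fused} bytes per frame")
print(f"reduction:      {reduction:.1%}")
```

In this toy setting the saving depends only on the number of fused stages; a realistic model would also have to account for block overlap, multi-resolution buffers and the data dependencies between functions that the abstract mentions.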


Pages: 12
