Optimization model for memory bandwidth usage in X-ray image enhancement

Cited by: 0
Authors
Albers, Rob [1, 2]
Suijs, Eric [2]
de With, Peter H. N. [1, 3]
Affiliations
[1] Eindhoven Univ Technol, Eindhoven, Netherlands
[2] Philips Med Syst CardioVasc X Ray, Best, Netherlands
[3] LogicaCMG Nederland BV, TSE3, Eindhoven, Netherlands
Source
REAL-TIME IMAGE PROCESSING 2008 | 2008 / Vol. 6811
Keywords
bandwidth reduction; low latency; multi-core; memory optimization; X-ray; image enhancement;
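DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract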
In cardiovascular minimally invasive interventions, physicians require low-latency X-ray imaging applications, because their actions must be directly visible on the screen. The image-processing system should enable the simultaneous execution of multiple functions. Because dedicated hardware lacks flexibility, there is growing interest in using off-the-shelf computer technology. Since memory bandwidth is a scarce resource, we focus on optimization methods for bandwidth reduction within multiprocessor systems at the chip level. We create a practical and realistic model of the required compute and memory bandwidth for a given set of image-processing functions. Similar modeling is applied to the available system resources. We concentrate in particular on X-ray image processing based on multi-resolution decomposition, noise reduction and image-enhancement techniques. We derive formulas with which we can optimize the mapping of the application onto processors, cache and memory for different configurations. The data-block granularity is matched to the memory hierarchy, so that caching is optimized for low latency. More specifically, we exploit the locality of the signal-processing functions to streamline the memory communication. A substantial performance improvement is realized by a new memory-communication model that incorporates the data dependencies of the image-processing functions. Results show a memory-bandwidth reduction on the order of 60% and a latency reduction on the order of 30-60% compared to straightforward implementations.
Pages: 12