ISP: An Optimal Out-of-Core Image-Set Processing Streaming Architecture for Parallel Heterogeneous Systems

Cited by: 3
Authors
Ha, Linh Khanh [1]
Krueger, Jens [2]
Dihl Comba, Joao Luiz [3]
Silva, Claudio T. [4]
Joshi, Sarang [1]
Affiliations
[1] Univ Utah, Sci Imaging & Comp Inst, 72 S Cent Campus Dr,WEB,Room 3692, Salt Lake City, UT 84112 USA
[2] Univ Saarland, D-66123 Saarbrucken, Germany
[3] Univ Fed Rio Grande do Sul, Inst Informat, BR-91501970 Porto Alegre, RS, Brazil
[4] NYU, Polytech Inst, Metrotech Ctr 6, Brooklyn, NY 11201 USA
Funding
National Science Foundation (USA)
Keywords
GPUs; out-of-core processing; atlas construction; diffeomorphism; multiimage processing framework; COMPRESSION; COMPUTER;
DOI
10.1109/TVCG.2012.32
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Image population analysis is the class of statistical methods that plays a central role in understanding the development, evolution, and disease of a population. However, these techniques often require excessive computational power and memory that are compounded with a large number of volumetric inputs. Restricted access to supercomputing power limits its influence in general research and practical applications. In this paper we introduce ISP, an Image-Set Processing streaming framework that harnesses the processing power of commodity heterogeneous CPU/GPU systems and attempts to solve this computational problem. In ISP, we introduce specially designed streaming algorithms and data structures that provide an optimal solution for out-of-core multiimage processing problems both in terms of memory usage and computational efficiency. ISP makes use of the asynchronous execution mechanism supported by parallel heterogeneous systems to efficiently hide the inherent latency of the processing pipeline of out-of-core approaches. Consequently, with computationally intensive problems, the ISP out-of-core solution can achieve the same performance as the in-core solution. We demonstrate the efficiency of the ISP framework on synthetic and real datasets.
Pages: 838-851
Page count: 14