Multi-GPU implementation of a VMAT treatment plan optimization algorithm

Cited by: 10
Authors
Tian, Zhen [1]
Peng, Fei [2]
Folkerts, Michael [1]
Tan, Jun [1]
Jia, Xun [1]
Jiang, Steve B. [1]
Affiliations
[1] Univ Texas SW Med Ctr Dallas, Dept Radiat Oncol, Dallas, TX 75390 USA
[2] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA
Keywords
multi-GPU; VMAT optimization; column-generation approach; MODULATED ARC THERAPY; RADIOTHERAPY DOSE CALCULATION; TEMPORAL NONLOCAL MEANS; CONE-BEAM CT; RADIATION-THERAPY; IMRT; DELIVERY; TOMOTHERAPY; RECONSTRUCTION; QUALITY;
DOI
10.1118/1.4919742
Chinese Library Classification number
R8 [Special Medicine]; R445 [Diagnostic Imaging];
Discipline classification code
1002 ; 100207 ; 1009 ;
Abstract
Purpose: Volumetric modulated arc therapy (VMAT) optimization is a computationally challenging problem due to its large data size, high degrees of freedom, and many hardware constraints. High-performance graphics processing units (GPUs) have been used to speed up the computations. However, a GPU's relatively small memory cannot hold a large dose-deposition coefficient (DDC) matrix, which arises in cases with, e.g., a large target size, multiple targets, multiple arcs, and/or small beamlet size. The main purpose of this paper is to report an implementation of a column-generation-based VMAT algorithm, previously developed in the authors' group, on a multi-GPU platform to solve the memory limitation problem. While the column-generation-based VMAT algorithm has been previously developed, the GPU implementation details have not been reported. Hence, another purpose is to present detailed techniques employed for GPU implementation. The authors also use this problem as an example to study the feasibility of using a multi-GPU platform to solve large-scale problems in medical physics.

Methods: The column-generation approach generates VMAT apertures sequentially by solving a pricing problem (PP) and a master problem (MP) iteratively. In the authors' method, the sparse DDC matrix is first stored on a CPU in coordinate list (COO) format. On the GPU side, this matrix is split into four submatrices according to beam angles, which are stored on four GPUs in compressed sparse row format. Computation of the beamlet price, the first step in the PP, is accomplished using multiple GPUs. A fast inter-GPU data transfer scheme is implemented using peer-to-peer access. The remaining steps of the PP and the MP are implemented on a CPU or a single GPU due to their modest problem scale and computational loads. The Barzilai and Borwein algorithm with a subspace step scheme is adopted here to solve the MP. A head and neck (H&N) cancer case is then used to validate the authors' method. The authors also compare their multi-GPU implementation with three different single-GPU implementation strategies, i.e., truncating the DDC matrix (S1), repeatedly transferring the DDC matrix between CPU and GPU (S2), and porting computations involving the DDC matrix to the CPU (S3), in terms of both plan quality and computational efficiency. Two more H&N patient cases and three prostate cases are used to demonstrate the advantages of the authors' method.

Results: The authors' multi-GPU implementation can finish the optimization process within ~1 min for the H&N patient case. S1 leads to an inferior plan quality, although its total time was 10 s shorter than that of the multi-GPU implementation due to the reduced matrix size. S2 and S3 yield the same plan quality as the multi-GPU implementation but take ~4 and ~6 min, respectively. High computational efficiency was consistently achieved for the other five patient cases tested, with VMAT plans of clinically acceptable quality obtained within 23-46 s. Conversely, to obtain clinically comparable or acceptable plans for all six of the VMAT cases tested in this paper, the optimization time needed in a commercial TPS system on CPU was found to be on the order of several minutes.

Conclusions: The results demonstrate that the multi-GPU implementation of the authors' column-generation-based VMAT optimization can handle the large-scale VMAT optimization problem efficiently without sacrificing plan quality. The authors' study may serve as an example to shed some light on other large-scale medical physics problems that require multi-GPU techniques. © 2015 American Association of Physicists in Medicine.
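The data layout described in the Methods (a COO matrix partitioned by beam angle, each part converted to compressed sparse row form, and a per-beamlet price computed part by part) can be illustrated with a small sketch. This is not the authors' CUDA implementation: it is a pure-Python, CPU-only illustration in which each partition stands in for one GPU. The orientation of the matrix (rows as beamlets, columns as voxels), the names `split_coo_by_rows`, `coo_to_csr`, `csr_matvec`, and `beamlet_prices`, and the definition of the price as a plain product of each partition with a voxel-wise gradient vector are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import List, Tuple

# Hypothetical layout: each COO entry is (beamlet_row, voxel_col, value).
Coo = List[Tuple[int, int, float]]
Csr = Tuple[List[int], List[int], List[float]]  # (indptr, indices, data)


def split_coo_by_rows(coo: Coo, row_bounds: List[int]) -> List[Coo]:
    """Partition COO entries into contiguous row blocks.

    Block k holds rows in [row_bounds[k], row_bounds[k + 1]), re-indexed
    from zero. Each block stands in for the beamlets of one beam-angle
    group, i.e. the data that one device would hold.
    """
    blocks: List[Coo] = [[] for _ in range(len(row_bounds) - 1)]
    for r, c, v in coo:
        for k in range(len(blocks)):
            if row_bounds[k] <= r < row_bounds[k + 1]:
                blocks[k].append((r - row_bounds[k], c, v))
                break
    return blocks


def coo_to_csr(coo: Coo, n_rows: int) -> Csr:
    """Convert COO entries to CSR arrays using a counting pass."""
    indptr = [0] * (n_rows + 1)
    for r, _, _ in coo:
        indptr[r + 1] += 1          # count entries per row
    for i in range(n_rows):
        indptr[i + 1] += indptr[i]  # prefix sum gives row start offsets
    indices = [0] * len(coo)
    data = [0.0] * len(coo)
    next_slot = list(indptr[:-1])   # next free position in each row
    for r, c, v in coo:
        p = next_slot[r]
        indices[p] = c
        data[p] = v
        next_slot[r] += 1
    return indptr, indices, data


def csr_matvec(csr: Csr, x: List[float]) -> List[float]:
    """Multiply a CSR matrix by a dense vector, one output per row."""
    indptr, indices, data = csr
    return [
        sum(data[p] * x[indices[p]] for p in range(indptr[i], indptr[i + 1]))
        for i in range(len(indptr) - 1)
    ]


def beamlet_prices(coo: Coo, row_bounds: List[int], grad: List[float]) -> List[float]:
    """Compute one price-like value per beamlet, block by block.

    In a multi-GPU setting each block's product would run on its own
    device; here the blocks are processed sequentially on the CPU.
    """
    prices: List[float] = []
    for k, block in enumerate(split_coo_by_rows(coo, row_bounds)):
        n_rows = row_bounds[k + 1] - row_bounds[k]
        prices.extend(csr_matvec(coo_to_csr(block, n_rows), grad))
    return prices


# Toy data: 4 beamlets, 3 voxels, split into two blocks of 2 beamlets each.
toy_coo: Coo = [(0, 0, 1.0), (0, 2, 2.0), (1, 1, 3.0),
                (2, 0, 4.0), (3, 1, 5.0), (3, 2, 6.0)]
toy_grad = [1.0, -1.0, 0.5]
toy_prices = beamlet_prices(toy_coo, [0, 2, 4], toy_grad)
```

Partitioning by whole rows means each block's product needs only the shared gradient vector and produces a disjoint slice of the output, so the per-block results can simply be concatenated. That independence is what makes this step a natural candidate for distribution across devices.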
Pages: 2841-2852
Number of pages: 12