Multi-GPU implementation of a VMAT treatment plan optimization algorithm

Cited by: 10

Authors:

Tian, Zhen [1]

Peng, Fei [2]

Folkerts, Michael [1]

Tan, Jun [1]

Jia, Xun [1]

Jiang, Steve B. [1]

Affiliations:

[1] Univ Texas SW Med Ctr Dallas, Dept Radiat Oncol, Dallas, TX 75390 USA

[2] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA

Source:

MEDICAL PHYSICS | 2015 / Volume 42 / Issue 06

Keywords:

multi-GPU; VMAT optimization; column-generation approach; MODULATED ARC THERAPY; RADIOTHERAPY DOSE CALCULATION; TEMPORAL NONLOCAL MEANS; CONE-BEAM CT; RADIATION-THERAPY; IMRT; DELIVERY; TOMOTHERAPY; RECONSTRUCTION; QUALITY;

DOI:

10.1118/1.4919742

Chinese Library Classification (CLC) number:

R8 [Special medicine]; R445 [Diagnostic imaging];

Subject classification code:

1002 ; 100207 ; 1009 ;

Abstract:

Purpose: Volumetric modulated arc therapy (VMAT) optimization is computationally challenging because of its large data size, high degrees of freedom, and many hardware constraints. High-performance graphics processing units (GPUs) have been used to speed up the computations. However, the relatively small memory of a GPU cannot accommodate a large dose-deposition coefficient (DDC) matrix, which arises in cases with, e.g., a large target, multiple targets, multiple arcs, and/or a small beamlet size. The main purpose of this paper is to report an implementation of a column-generation-based VMAT algorithm, previously developed in the authors' group, on a multi-GPU platform to overcome this memory limitation. Although the algorithm itself was developed previously, the details of its GPU implementation have not been reported. Hence, a second purpose is to present the techniques employed in the GPU implementation. The authors also use this problem as an example to study the feasibility of using a multi-GPU platform to solve large-scale problems in medical physics.
Methods: The column-generation approach generates VMAT apertures sequentially by iteratively solving a pricing problem (PP) and a master problem (MP). In the authors' method, the sparse DDC matrix is first stored on the CPU in coordinate list (COO) format. On the GPU side, this matrix is split into four submatrices according to beam angle, and the submatrices are stored on four GPUs in compressed sparse row (CSR) format. Computation of the beamlet price, the first step of the PP, is carried out on multiple GPUs. Fast inter-GPU data transfer is achieved using peer-to-peer access. The remaining steps of the PP and the MP are implemented on the CPU or on a single GPU because of their modest problem scale and computational load. The Barzilai and Borwein algorithm with a subspace step scheme is adopted to solve the MP. A head and neck (H&N) cancer case is then used to validate the authors' method. The authors also compare their multi-GPU implementation with three single-GPU implementation strategies, i.e., truncating the DDC matrix (S1), repeatedly transferring the DDC matrix between CPU and GPU (S2), and porting computations involving the DDC matrix to the CPU (S3), in terms of both plan quality and computational efficiency. Two more H&N patient cases and three prostate cases are used to demonstrate the advantages of the authors' method.
Results: The authors' multi-GPU implementation finishes the optimization within approximately 1 min for the H&N patient case. S1 leads to inferior plan quality, although its total time is 10 s shorter than that of the multi-GPU implementation because of the reduced matrix size. S2 and S3 yield the same plan quality as the multi-GPU implementation but take approximately 4 and 6 min, respectively. High computational efficiency was consistently achieved for the other five patient cases tested, with VMAT plans of clinically acceptable quality obtained within 23-46 s. In contrast, to obtain clinically comparable or acceptable plans for all six VMAT cases tested in this paper, the optimization time needed in a commercial treatment planning system (TPS) on CPU was found to be on the order of several minutes.
Conclusions: The results demonstrate that the multi-GPU implementation of the authors' column-generation-based VMAT optimization can handle the large-scale VMAT optimization problem efficiently without sacrificing plan quality. The authors' study may serve as an example to shed some light on other large-scale medical physics problems that require multi-GPU techniques. © 2015 American Association of Physicists in Medicine.
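
The data layout described in the Methods (COO storage on the host, a split by beam angle into per-device CSR submatrices, and a beamlet-price computation spread over the devices) can be illustrated with a small CPU-only Python sketch. This is not the authors' code, and several points are assumptions made purely for illustration: the DDC entries are taken as (voxel, beamlet, coefficient) triples, the price of a beamlet is taken as the dot product of its DDC column with a per-voxel objective gradient (the abstract does not state the formula or sign convention), each submatrix is stored transposed so that one CSR row corresponds to one beamlet, and an ordinary loop over parts stands in for the four GPUs. All function names are hypothetical.

```python
from typing import Dict, List, Tuple

# One COO entry of the DDC matrix: (voxel index, beamlet index, coefficient).
Entry = Tuple[int, int, float]
# CSR triple: (row pointers, column indices, values).
Csr = Tuple[List[int], List[int], List[float]]


def part_of(angle: int, n_angles: int, n_parts: int) -> int:
    """Map a beam-angle index to a part (one part per device), in contiguous blocks."""
    return angle * n_parts // n_angles


def to_csr_by_beamlet(entries: List[Entry], beamlets: List[int]) -> Csr:
    """Build the CSR form of the transposed submatrix: one row per beamlet."""
    local: Dict[int, int] = {b: i for i, b in enumerate(beamlets)}
    counts = [0] * len(beamlets)
    for _, b, _ in entries:
        counts[local[b]] += 1
    indptr = [0]
    for c in counts:
        indptr.append(indptr[-1] + c)
    indices = [0] * len(entries)
    data = [0.0] * len(entries)
    cursor = indptr[:-1]  # slicing copies; next free slot in each row
    for voxel, b, val in entries:
        k = cursor[local[b]]
        indices[k], data[k] = voxel, val
        cursor[local[b]] += 1
    return indptr, indices, data


def row_dot(csr: Csr, vec: List[float]) -> List[float]:
    """Sparse matrix-vector product: one dot product per CSR row."""
    indptr, indices, data = csr
    return [
        sum(data[k] * vec[indices[k]] for k in range(indptr[r], indptr[r + 1]))
        for r in range(len(indptr) - 1)
    ]


def beamlet_prices(coo: List[Entry], beamlet_angle: List[int], n_angles: int,
                   grad: List[float], n_parts: int = 4) -> List[float]:
    """Split the DDC matrix by beam angle and compute every beamlet's price.

    Each loop iteration plays the role of one device working on its own
    submatrix; the partial results are gathered into a single price vector.
    """
    prices = [0.0] * len(beamlet_angle)
    for p in range(n_parts):
        beamlets = [b for b, a in enumerate(beamlet_angle)
                    if part_of(a, n_angles, n_parts) == p]
        entries = [e for e in coo
                   if part_of(beamlet_angle[e[1]], n_angles, n_parts) == p]
        csr = to_csr_by_beamlet(entries, beamlets)
        for b, price in zip(beamlets, row_dot(csr, grad)):
            prices[b] = price
    return prices
```

In this layout each part needs only its own submatrix plus the shared per-voxel gradient vector, so the only data that would have to move between devices are that vector and the resulting prices. That is consistent with, though not confirmed by, the abstract's mention of peer-to-peer transfers.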

Citation

Pages: 2841 - 2852

Page count: 12
