Background: Formulating a clinically acceptable plan within the time-constrained clinical setting of brachytherapy is challenging for clinicians. Deep learning-based dose prediction methods have shown promise for improving efficiency, but development has focused primarily on external beam radiation therapy. Thus, there is a need to translate these methods to brachytherapy.

Purpose: This study proposes a dose prediction model that combines an attention-gating mechanism with a 3D UNET for cervical cancer high-dose-rate intracavitary brachytherapy treatment planning with tandem-and-ovoid/ring applicators.

Methods: A multi-institutional data set consisting of 77 retrospective clinical brachytherapy plans was used in this study. The data were preprocessed and augmented to increase the number of plans to 252. A 3D UNET architecture with attention gates was constructed and trained to map contour information to the dose distribution. The trained model was evaluated on a testing data set using various metrics, including dose statistics and dose-volume indices. A baseline UNET model was also trained for a fair comparison.

Results: The attention-gated 3D UNET model predicted dose distributions similar to the ground truth with competitive accuracy. The average values of the mean absolute errors were 0.46 ± 11.71 Gy (vs. 0.47 ± 9.16 Gy for the baseline UNET) in CTVHR, 0.55 ± 0.67 Gy (vs. 0.70 ± 1.54 Gy for the baseline UNET) in bladder, 0.42 ± 0.46 Gy (vs. 0.49 ± 1.34 Gy for the baseline UNET) in rectum, and 0.31 ± 0.65 Gy (vs. 0.20 ± 3.76 Gy for the baseline UNET) in sigmoid. The mean individual differences in ΔD2cc for bladder, rectum, and sigmoid were 0.38 ± 1.19 (p = 0.50), 0.43 ± 0.71 (p = 0.41), and -0.47 ± 0.79 (p = 0.30) Gy, respectively. Similarly, the mean individual differences in ΔD1cc for bladder, rectum, and sigmoid were 0.09 ± 1.21 (p = 0.36), 0.20 ± 0.95 (p = 0.24), and -0.21 ± 0.59 (p = 0.30) Gy, respectively.
The mean individual differences for ΔD90, ΔV100%, ΔV150%, and ΔV200% of the CTVHR were -0.45 ± 2.42 Gy (p = 0.26), 0.55 ± 9.42% (p = 0.78), 0.82 ± 4.21% (p = 0.81), and -0.80 ± 10.48% (p = 0.36), respectively. The model requires less than 5 s to predict a full 3D dose distribution for a new patient plan.

Conclusion: The attention-gated 3D UNET showed promising capability in predicting voxel-wise dose distributions compared with the baseline 3D UNET. This model could be deployed for clinical use to predict 3D dose distributions for near real-time decision-making before planning, for quality assurance, and for guiding future automated planning, making the current workflow more efficient.
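
For readers unfamiliar with the evaluation metrics named in the Results (mean absolute error within a structure, D90, D2cc, V100%), the following is a minimal NumPy sketch of how such quantities are commonly computed from a 3D dose grid and a binary structure mask. It is not the authors' evaluation code: the function names, the `voxel_cc` voxel-volume argument, and the `prescription_gy` argument are assumptions made for illustration, and the sketch ignores partial-voxel and interpolation effects that a clinical treatment planning system would handle.

```python
import numpy as np


def masked_mae(pred, truth, mask):
    """Mean absolute dose error (Gy) over the voxels inside a structure mask."""
    m = np.asarray(mask, dtype=bool)
    return float(np.mean(np.abs(pred[m] - truth[m])))


def d_percent(dose, mask, percent):
    """Dx%: minimum dose (Gy) received by the hottest `percent` % of the structure.

    D90 is therefore the 10th percentile of the dose values in the structure.
    """
    m = np.asarray(mask, dtype=bool)
    return float(np.percentile(dose[m], 100.0 - percent))


def d_cc(dose, mask, volume_cc, voxel_cc):
    """Dxcc: minimum dose (Gy) within the hottest `volume_cc` cm^3 of the structure.

    `voxel_cc` is the volume of a single voxel in cm^3 (an assumed input).
    """
    m = np.asarray(mask, dtype=bool)
    values = np.sort(dose[m])[::-1]          # hottest voxels first
    n = int(round(volume_cc / voxel_cc))     # number of voxels spanning the volume
    n = min(max(n, 1), values.size)          # clamp to the structure size
    return float(values[n - 1])


def v_percent(dose, mask, prescription_gy, percent):
    """Vx%: percentage of the structure receiving at least x% of the prescription."""
    m = np.asarray(mask, dtype=bool)
    threshold = prescription_gy * percent / 100.0
    return float(100.0 * np.mean(dose[m] >= threshold))
```

With a predicted and a ground-truth dose grid, a difference such as ΔD2cc for one plan would then be `d_cc(pred, mask, 2.0, voxel_cc) - d_cc(truth, mask, 2.0, voxel_cc)`; the sign convention of the differences reported above is not stated in the abstract, so this ordering is also an assumption.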