Data clustering to select clinically-relevant test cases for algorithm benchmarking and characterization

Cited by: 7
Authors
Weppler, Sarah [1, 2]
Schinkel, Colleen [2, 3]
Kirkby, Charles [1, 3, 4]
Smith, Wendy [1, 2, 3]
Affiliations
[1] Univ Calgary, Dept Phys & Astron, Calgary, AB T2N 1N4, Canada
[2] Tom Baker Canc Clin, Dept Med Phys, 1331 29 St NW, Calgary, AB T2N 4N2, Canada
[3] Univ Calgary, Dept Oncol, 2500 Univ Dr NW, Calgary, AB T2N 1N4, Canada
[4] Jack Ady Canc Ctr, Dept Med Phys, 960 19 St S, Lethbridge, AB T1J 1W5, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
algorithm benchmarking; data clustering; test case selection; deformable image registration; radiation therapy; radiotherapy; accuracy; phantom; head
DOI
10.1088/1361-6560/ab6e54
Chinese Library Classification
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Algorithm benchmarking and characterization are important parts of algorithm development and validation prior to clinical implementation. However, benchmarking may be limited to a small collection of test cases due to the resource-intensive nature of establishing 'ground-truth' references. This study proposes a framework for selecting test cases to assess algorithm and workflow equivalence. Effective test case selection may minimize the number of ground-truth comparisons required to establish robust and clinically relevant benchmarking and characterization results. To demonstrate the proposed framework, we clustered differences between two independent workflows estimating during-treatment dose objective violations for 15 head and neck cancer patients (15 planning CTs, 105 on-unit CBCTs). Each workflow used a different deformable image registration algorithm to estimate inter-fractional anatomy and contour changes. The Hopkins statistic tested whether workflow output was inherently clustered, and k-medoid clustering formalized cluster assignment. Further statistical analyses verified the relevance of clusters to algorithm output. Data at cluster centers ('medoids') were considered as candidate test cases representative of workflow-relevant algorithm differences. The framework indicated that differences in estimated dose objective violations were naturally grouped (Hopkins = 0.75, providing 90% confidence). K-medoid clustering identified five clusters which stratified workflow differences (MANOVA: p < 0.001) in estimated parotid gland D50%, spinal cord/brainstem Dmax, and high dose CTV coverage dose violations (Kendall's tau: p < 0.05). Systematic algorithm differences resulting in workflow discrepancies were: parotid gland volumes (ANOVA: p < 0.001), external contour deformations (t-test: p = 0.022), and CTV-to-PTV margins (t-test: p = 0.009), respectively. Five candidate test cases were verified as representative of the five clusters.
The framework successfully clustered workflow outputs and identified five test cases representative of clinically relevant algorithm discrepancies. This approach may improve the allocation of resources during the benchmarking and characterization process and the applicability of results to clinical data.
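As an illustration of the two clustering steps named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes Euclidean distances, the Hopkins convention in which values near 0.5 suggest no structure and values near 1 suggest clustering, and a simple alternating (Voronoi-style) k-medoids update rather than any specific algorithm from the paper. The function names `hopkins` and `k_medoids`, their parameters, and the toy data are all hypothetical.

```python
import numpy as np


def hopkins(X, m=None, seed=0):
    """Hopkins statistic (assumed convention: ~0.5 random, ~1 clustered)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    m = m or max(1, n // 10)  # number of probe points (assumed default)
    idx = rng.choice(n, size=m, replace=False)
    # Uniform probe points drawn inside the bounding box of the data.
    uniform = rng.uniform(X.min(axis=0), X.max(axis=0), size=(m, d))
    # u: distance from each uniform probe to its nearest data point.
    u = np.linalg.norm(uniform[:, None, :] - X[None, :, :], axis=2).min(axis=1)
    # w: distance from each sampled data point to its nearest *other* data point.
    dist = np.linalg.norm(X[idx][:, None, :] - X[None, :, :], axis=2)
    dist[np.arange(m), idx] = np.inf  # exclude self-distances
    w = dist.min(axis=1)
    return (u ** d).sum() / ((u ** d).sum() + (w ** d).sum())


def k_medoids(X, k, init=None, n_iter=100, seed=0):
    """Alternating k-medoids: assign to nearest medoid, then re-pick medoids."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    if init is not None:
        medoids = np.array(init, dtype=int)
    else:
        medoids = rng.choice(len(X), size=k, replace=False)
    for _ in range(n_iter):
        labels = D[:, medoids].argmin(axis=1)
        new = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size:
                # New medoid: the member with the smallest total distance
                # to all other members of its cluster.
                within = D[np.ix_(members, members)].sum(axis=1)
                new[j] = members[within.argmin()]
        if np.array_equal(new, medoids):
            break
        medoids = new
    labels = D[:, medoids].argmin(axis=1)
    return medoids, labels


# Toy data: two 5-point squares (4 corners plus center), far apart.
square = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5]])
toy = np.vstack([square, square + 10])
medoids, labels = k_medoids(toy, k=2, init=[0, 9])
print(medoids, labels)  # the medoid rows are the candidate "test cases"
```

In the setting described above, each row of `X` would hold one case's workflow differences, and the rows indexed by `medoids` would be the candidate test cases. The alternating update can settle in a local optimum, so in practice several initializations would normally be compared.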
Pages: 12