Gross failure rates and failure modes for a commercial AI-based auto-segmentation algorithm in head and neck cancer patients

Cited by: 5
Authors
Temple, Simon W. P. [1]
Rowbottom, Carl G. [1, 2]
Affiliations
[1] Clatterbridge Canc Ctr NHS Fdn Trust, Med Phys Dept, Liverpool, England
[2] Univ Liverpool, Dept Phys, Liverpool, England
Source
JOURNAL OF APPLIED CLINICAL MEDICAL PHYSICS | 2024 / Volume 25 / Issue 06
Keywords
auto-segmentation; deep learning; failure modes; INTEROBSERVER VARIABILITY; DELINEATION; ORGANS; RISK; IMPLEMENTATION; ONCOLOGY; QUALITY
DOI
10.1002/acm2.14273
Chinese Library Classification number
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Discipline classification code
1002; 100207; 1009
Abstract
Purpose: Artificial intelligence (AI) based commercial software can automatically delineate organs at risk (OARs), with potential for efficiency savings in the radiotherapy treatment planning pathway and for reduced inter- and intra-observer variability. There has been little research into the gross failure rates and failure modes of such systems.
Method: Fifty head and neck (H&N) patient data sets with "gold standard" contours were compared to AI-generated contours to produce expected mean and standard deviation values of the Dice Similarity Coefficient (DSC) for four common H&N OARs (brainstem, mandible, left parotid and right parotid). An AI-based commercial system was then applied to 500 H&N patients. AI-generated contours were compared to manual contours outlined by an expert human, and a gross failure was defined as a DSC more than three standard deviations below the expected mean. Failures were inspected to assess the reason for failure of the AI-based system, and failures relating to suboptimal manual contouring were censored. True failures were classified into four sub-types (setup position, anatomy, image artefacts and unknown).
Results: There were 24 true failures of the AI-based commercial software, a gross failure rate of 1.2%. Fifteen failures were due to patient anatomy, four were due to dental image artefacts, three were due to patient position and two were unknown. True failure rates by OAR were 0.4% (brainstem), 2.2% (mandible), 1.4% (left parotid) and 0.8% (right parotid).
Conclusion: True failures of the AI-based system were predominantly associated with a non-standard element within the CT scan. It is likely that these non-standard elements were the reason for the gross failure, which suggests that the patient datasets used to train the AI model did not contain sufficient heterogeneity. Regardless of the reasons for failure, the true failure rate of the AI-based system in the H&N region for the OARs investigated was low (approximately 1%).
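The thresholding rule described in the Method can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the flat 0/1 mask representation, the use of the sample standard deviation, and the handling of two empty masks are all assumptions made for illustration. The final helper assumes the reported 1.2% is taken over 500 patients × 4 OARs = 2000 contours, which is consistent with the 24 true failures stated in the Results.

```python
from statistics import mean, stdev


def dice(mask_a, mask_b):
    """Dice Similarity Coefficient for two equal-length binary masks.

    Masks are flat iterables of 0/1 voxel labels (a hypothetical
    representation chosen to keep the sketch dependency-free).
    """
    a = [bool(v) for v in mask_a]
    b = [bool(v) for v in mask_b]
    if len(a) != len(b):
        raise ValueError("masks must have the same number of voxels")
    size_a, size_b = sum(a), sum(b)
    if size_a + size_b == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    overlap = sum(x and y for x, y in zip(a, b))
    return 2.0 * overlap / (size_a + size_b)


def gross_failure_threshold(reference_dscs, n_sd=3.0):
    """Expected mean DSC minus n_sd standard deviations.

    Assumption: the sample standard deviation is used; the abstract
    does not say which estimator was applied.
    """
    return mean(reference_dscs) - n_sd * stdev(reference_dscs)


def flag_gross_failures(dscs, threshold):
    """Indices of cases whose DSC falls below the threshold."""
    return [i for i, d in enumerate(dscs) if d < threshold]


def gross_failure_rate(n_failures, n_patients, n_oars):
    """Failures divided by the total number of contours assessed."""
    return n_failures / (n_patients * n_oars)


if __name__ == "__main__":
    # Overall rate from the figures quoted in the abstract.
    print(f"{gross_failure_rate(24, 500, 4):.1%}")
```

In this sketch a threshold would be computed separately for each OAR from its own reference cohort, since the abstract reports expected DSC values per structure. Flagged cases would still need the manual inspection described above to separate true failures from suboptimal manual contours.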
Pages: 10