Reporting guideline for the early-stage clinical evaluation of decision support systems driven by artificial intelligence: DECIDE-AI

Cited by: 45
Authors
Vasey, Baptiste [1, 2, 3]
Nagendran, Myura [4]
Campbell, Bruce [5, 6]
Clifton, David A. [2]
Collins, Gary S. [7]
Denaxas, Spiros [8, 9, 10, 11]
Denniston, Alastair K. [12, 13, 14]
Faes, Livia [14]
Geerts, Bart [15]
Ibrahim, Mudathir [1, 16]
Liu, Xiaoxuan [3, 12]
Mateen, Bilal A. [8, 17, 18]
Mathur, Piyush [19]
McCradden, Melissa D. [20, 21]
Morgan, Lauren [22]
Ordish, Johan [23]
Rogers, Campbell [24]
Saria, Suchi [25, 26, 27, 28, 29]
Ting, Daniel S. W. [30, 31]
Watkinson, Peter [3, 32]
Weber, Wim [33]
Wheatstone, Peter [34]
McCulloch, Peter [1]
Affiliations
[1] Univ Oxford, Nuffield Dept Surg Sci, Oxford, England
[2] Univ Oxford, Inst Biomed Engn, Dept Engn Sci, Oxford, England
[3] Univ Oxford, Nuffield Dept Clin Neurosci, Crit Care Res Grp, Oxford, England
[4] Imperial Coll London, UKRI Ctr Doctoral Training AI Healthcare, London, England
[5] Univ Exeter, Med Sch, Exeter, Devon, England
[6] Royal Devon & Exeter Hosp, Exeter, Devon, England
[7] Univ Oxford, Ctr Stat Med, Nuffield Dept Orthopaed Rheumatol & Musculoskelet, Oxford, England
[8] UCL, Inst Hlth Informat, London, England
[9] British Heart Fdn Data Sci Ctr, London, England
[10] Hlth Data Res England, London, England
[11] UCL Hosp Biomed Res Ctr, London, England
[12] Univ Hosp Birmingham NHS Fdn Trust, Birmingham, W Midlands, England
[13] Univ Birmingham, Acad Unit Ophthalmol, Coll Med & Dent Sci, Inst Inflammat & Ageing, Birmingham, W Midlands, England
[14] Moorfields Eye Hosp NHS Fdn Trust, London, England
[15] Healthplusai R&D BV, Amsterdam, Netherlands
[16] Maimonides Hosp, Dept Surg, Brooklyn, NY 11219 USA
[17] Wellcome Trust Res Labs, London, England
[18] Alan Turing Inst, London, England
[19] Cleveland Clin, Dept Gen Anesthesiol, Anesthesiol Inst, Cleveland, OH 44106 USA
[20] Hosp Sick Children, Toronto, ON, Canada
[21] Univ Toronto, Dalla Lana Sch Publ Hlth, Toronto, ON, Canada
[22] Morgan Human Syst Ltd, Shrewsbury, Salop, England
[23] Med & Healthcare Prod Regulatory Agcy, London, England
[24] HeartFlow Inc, Redwood City, CA USA
[25] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA
[26] Johns Hopkins Univ, Dept Stat, Baltimore, MD USA
[27] Johns Hopkins Univ, Dept Hlth Policy, Baltimore, MD USA
[28] Johns Hopkins Univ, Div Informat, Baltimore, MD USA
[29] Bayesian Hlth, New York, NY USA
[30] Singapore Eye Res Inst, Singapore Natl Eye Ctr, Singapore, Singapore
[31] Natl Univ Singapore, Duke NUS Med Sch, Singapore, Singapore
[32] Oxford Univ Hosp NHS Trust, NIHR Biomed Res Ctr Oxford, Oxford, England
[33] The BMJ, London, England
[34] Univ Leeds, Sch Med, Leeds, W Yorkshire, England
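Funding
UK Medical Research Council; Wellcome Trust (UK); US National Institutes of Health; UK Engineering and Physical Sciences Research Council; US National Science Foundation
Keywords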
DELPHI; STATEMENT;
DOI
10.1038/s41591-022-01772-9
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The DECIDE-AI checklist, resulting from a multi-stakeholder group of experts in a Delphi process and following the EQUATOR Network's recommendations, includes key items that should be reported in early-stage clinical studies of AI-based decision support systems, to ensure a responsible and transparent deployment of AI systems in healthcare.
A growing number of artificial intelligence (AI)-based clinical decision support systems are showing promising performance in preclinical, in silico evaluation, but few have yet demonstrated real benefit to patient care. Early-stage clinical evaluation is important to assess an AI system's actual clinical performance at small scale, ensure its safety, evaluate the human factors surrounding its use and pave the way to further large-scale trials. However, the reporting of these early studies remains inadequate. The present statement provides a multi-stakeholder, consensus-based reporting guideline for the Developmental and Exploratory Clinical Investigations of DEcision support systems driven by Artificial Intelligence (DECIDE-AI). We conducted a two-round, modified Delphi process to collect and analyze expert opinion on the reporting of early clinical evaluation of AI systems. Experts were recruited from 20 pre-defined stakeholder categories. The final composition and wording of the guideline were determined at a virtual consensus meeting. The checklist and the Explanation & Elaboration (E&E) sections were refined based on feedback from a qualitative evaluation process. In total, 123 experts participated in the first round of Delphi, 138 in the second round, 16 in the consensus meeting and 16 in the qualitative evaluation. The DECIDE-AI reporting guideline comprises 17 AI-specific reporting items (made up of 28 subitems) and ten generic reporting items, with an E&E paragraph provided for each.
Through consultation and consensus with a range of stakeholders, we developed a guideline comprising key items that should be reported in early-stage clinical studies of AI-based decision support systems in healthcare. By providing an actionable checklist of minimal reporting items, the DECIDE-AI guideline will facilitate the appraisal of these studies and replicability of their findings.
Pages: 924 / +
Page count: 12