Do as AI say: susceptibility in deployment of clinical decision-aids

Citations: 250
Authors
Gaube, Susanne [1, 2]
Suresh, Harini [3 ]
Raue, Martina [2 ]
Merritt, Alexander [4 ]
Berkowitz, Seth J. [5 ]
Lermer, Eva [6, 7]
Coughlin, Joseph F. [2 ]
Guttag, John V. [3 ]
Colak, Errol [8, 9]
Ghassemi, Marzyeh [10, 11, 12]
Affiliations
[1] Univ Regensburg, Dept Psychol, Regensburg, Germany
[2] MIT, MIT AgeLab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[3] MIT, MIT Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[4] Boston Med Ctr, Boston, MA USA
[5] Beth Israel Deaconess Med Ctr, Dept Radiol, 330 Brookline Ave, Boston, MA 02215 USA
[6] Ludwig Maximilians Univ Munchen, LMU Ctr Leadership & People Management, Munich, Germany
[7] FOM Univ Appl Sci Econ & Management, Munich, Germany
[8] St Michaels Hosp, Li Ka Shing Knowledge Inst, Toronto, ON, Canada
[9] Univ Toronto, Dept Med Imaging, Toronto, ON, Canada
[10] Univ Toronto, Dept Comp Sci, Toronto, ON, Canada
[11] Univ Toronto, Dept Med, Toronto, ON, Canada
[12] Vector Inst, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
CLASSIFICATION; ALGORITHM; INTELLIGENCE; PERFORMANCE; TRUST;
DOI
10.1038/s41746-021-00385-9
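Chinese Library Classification number
R19 [Health care organization and services (health services administration)];
Discipline classification code
Abstract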
Artificial intelligence (AI) models for decision support have been developed for clinical settings such as radiology, but little work evaluates the potential impact of such systems. In this study, physicians received chest X-rays and diagnostic advice, some of which was inaccurate, and were asked to evaluate advice quality and make diagnoses. All advice was generated by human experts, but some was labeled as coming from an AI system. As a group, radiologists rated advice as lower quality when it appeared to come from an AI system; physicians with less task-expertise did not. Diagnostic accuracy was significantly worse when participants received inaccurate advice, regardless of the purported source. This work raises important considerations for how advice, AI and non-AI, should be deployed in clinical environments.
Pages: 8