Mitigating Cognitive Biases in Clinical Decision-Making Through Multi-Agent Conversations Using Large Language Models: Simulation Study

Cited by: 8
Authors
Ke, Yuhe [1]
Yang, Rui
Lie, Sui An
Lim, Taylor Xin Yi
Ning, Yilin
Li, Irene
Abdullah, Hairil Rizal
Ting, Daniel Shu Wei [1]
Liu, Nan [1,2]
Affiliations
[1] Duke NUS Med Sch, Ctr Quantitat Med, 8 Coll Rd, Singapore 169857, Singapore
[2] Natl Univ Singapore, Inst Data Sci, Singapore, Singapore
Keywords
clinical decision-making; cognitive bias; generative artificial intelligence; large language model; multi-agent
DOI
10.2196/59439
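Chinese Library Classification number
R19 [Health care organization and services (health services administration)]
Discipline classification code
Abstract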
Background: Cognitive biases in clinical decision-making significantly contribute to errors in diagnosis and suboptimal patient outcomes. Addressing these biases presents a formidable challenge in the medical field.

Objective: This study aimed to explore the role of large language models (LLMs) in mitigating these biases through the use of the multi-agent framework. We simulate the clinical decision-making processes through multi-agent conversation and evaluate its efficacy in improving diagnostic accuracy compared with humans.

Methods: A total of 16 published and unpublished case reports where cognitive biases have resulted in misdiagnoses were identified from the literature. In the multi-agent framework, we leveraged GPT-4 (OpenAI) to facilitate interactions among different simulated agents to replicate clinical team dynamics. Each agent was assigned a distinct role: (1) making the final diagnosis after considering the discussions, (2) acting as a devil's advocate to correct confirmation and anchoring biases, (3) serving as a field expert in the required medical subspecialty, (4) facilitating discussions to mitigate premature closure bias, and (5) recording and summarizing findings. We tested varying combinations of these agents within the framework to determine which configuration yielded the highest rate of correct final diagnoses. Each scenario was repeated 5 times for consistency. The accuracy of the initial diagnoses and the final differential diagnoses were evaluated, and comparisons with human-generated answers were made using the Fisher exact test.

Results: A total of 240 responses were evaluated (3 different multi-agent frameworks). The initial diagnosis had an accuracy of 0% (0/80). However, following multi-agent discussions, the accuracy for the top 2 differential diagnoses increased to 76% (61/80) for the best-performing multi-agent framework (Framework 4-C). This was significantly higher compared with the accuracy achieved by human evaluators (odds ratio 3.49; P=.002).

Conclusions: The multi-agent framework demonstrated an ability to re-evaluate and correct misconceptions, even in scenarios with misleading initial investigations. In addition, the LLM-driven, multi-agent conversation framework shows promise in enhancing diagnostic accuracy in diagnostically challenging medical scenarios.
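
The counts reported in the abstract can be checked with a little arithmetic. The sketch below, in Python with only the standard library, recomputes the response totals and the 61/80 accuracy, and shows one way a two-sided Fisher exact test on a 2x2 table could be computed. The abstract does not give the human evaluators' counts, so `HYPOTHETICAL_HUMAN_CORRECT` and `HYPOTHETICAL_HUMAN_TOTAL` are placeholders for illustration only and are not expected to reproduce the reported odds ratio or P value. The function names are assumptions as well, and this is not the authors' analysis code.

```python
from math import comb

# Figures stated in the abstract.
CASES = 16
REPEATS_PER_CASE = 5
FRAMEWORKS = 3
BEST_FRAMEWORK_CORRECT = 61  # top 2 differential diagnoses, Framework 4-C

# HYPOTHETICAL placeholders: the abstract does not report the human counts.
HYPOTHETICAL_HUMAN_CORRECT = 40
HYPOTHETICAL_HUMAN_TOTAL = 80

responses_per_framework = CASES * REPEATS_PER_CASE        # 16 * 5 = 80
total_responses = responses_per_framework * FRAMEWORKS    # 80 * 3 = 240
best_accuracy = BEST_FRAMEWORK_CORRECT / responses_per_framework  # 0.7625


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Uses one common convention: sum the hypergeometric probabilities of
    all tables with the same margins that are no more probable than the
    observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def pmf(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = pmf(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        pmf(x)
        for x in range(lowest, highest + 1)
        if pmf(x) <= p_observed * (1 + 1e-9)
    )


def sample_odds_ratio(a, b, c, d):
    """Unconditional sample odds ratio; undefined when b * c is zero."""
    return (a * d) / (b * c)


if __name__ == "__main__":
    llm_wrong = responses_per_framework - BEST_FRAMEWORK_CORRECT
    human_wrong = HYPOTHETICAL_HUMAN_TOTAL - HYPOTHETICAL_HUMAN_CORRECT
    table = (BEST_FRAMEWORK_CORRECT, llm_wrong,
             HYPOTHETICAL_HUMAN_CORRECT, human_wrong)
    print("responses per framework:", responses_per_framework)
    print("total responses:", total_responses)
    print("best framework accuracy:", best_accuracy)
    print("odds ratio (hypothetical human counts):", sample_odds_ratio(*table))
    print("Fisher P (hypothetical human counts):", fisher_exact_two_sided(*table))
```

With the real human counts substituted for the placeholders, the same two functions would give the comparison described in the Methods and Results, subject to the two-sided convention chosen here.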
Pages: 11