Leveraging Large Language Models for Decision Support in Personalized Oncology

Cited by: 67
Authors
Benary, Manuela [1, 2, 3, 4]
Wang, Xing David [5]
Schmidt, Max [1, 2, 3, 6]
Soll, Dominik [1, 2, 3, 7]
Hilfenhaus, Georg [1, 2, 3, 8]
Nassir, Mani [1, 2, 3, 8]
Sigler, Christian [1, 2, 3]
Knoedler, Maren [1, 2, 3]
Keller, Ulrich [2, 3, 6, 9, 10]
Beule, Dieter [4]
Keilholz, Ulrich [1, 2, 3, 9, 10]
Leser, Ulf [5]
Rieke, Damian T. [1, 2, 3, 6, 9, 10]
Affiliations
[1] Charite Univ Med Berlin, Comprehens Canc Ctr, Charitepl 1, D-10117 Berlin, Germany
[2] Free Univ Berlin, Berlin, Germany
[3] Humboldt Univ, Berlin, Germany
[4] Charite Univ Med Berlin, Berlin Inst Hlth, Core Unit Bioinformat, Charitepl 1, Berlin, Germany
[5] Humboldt Univ, Knowledge Management Bioinformat, Berlin, Germany
[6] Charite Univ Med Berlin, Dept Hematol Oncol & Canc Immunol, Campus Benjamin Franklin, Berlin, Germany
[7] Charite Univ Med Berlin, Dept Endocrinol & Metab Dis, Berlin, Germany
[8] Charite Univ Med Berlin, Dept Hematol Oncol & Canc Immunol, Campus Charite Mitte, Berlin, Germany
[9] German Canc Consortium, Berlin, Germany
[10] German Canc Res Ctr, Partner Site Berlin, Berlin, Germany
Keywords
EFFICACY
DOI
10.1001/jamanetworkopen.2023.43689
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Importance: Clinical interpretation of complex biomarkers for precision oncology currently requires manual investigations of previous studies and databases. Conversational large language models (LLMs) might be beneficial as automated tools for assisting clinical decision-making.
Objective: To assess performance and define their role using 4 recent LLMs as support tools for precision oncology.
Design, Setting, and Participants: This diagnostic study examined 10 fictional cases of patients with advanced cancer with genetic alterations. Each case was submitted to 4 different LLMs (ChatGPT, Galactica, Perplexity, and BioMedLM) and 1 expert physician to identify personalized treatment options in 2023. Treatment options were masked and presented to a molecular tumor board (MTB), whose members rated the likelihood of a treatment option coming from an LLM on a scale from 0 to 10 (0, extremely unlikely; 10, extremely likely) and decided whether the treatment option was clinically useful.
Main Outcomes and Measures: Number of treatment options, precision, recall, F1 score of LLMs compared with human experts, recognizability, and usefulness of recommendations.
Results: For 10 fictional cancer patients (4 with lung cancer, 6 with other; median [IQR] 3.5 [3.0-4.8] molecular alterations per patient), the human expert identified a median (IQR) of 4.0 (4.0-4.0) treatment options, compared with 3.0 (3.0-5.0), 7.5 (4.3-9.8), 11.5 (7.8-13.0), and 13.0 (11.3-21.5) for the 4 LLMs, respectively. When considering the expert as a criterion standard, LLM-proposed treatment options reached F1 scores of 0.04, 0.17, 0.14, and 0.19 across all patients combined. Combining treatment options from different LLMs allowed a precision of 0.29 and a recall of 0.29 for an F1 score of 0.29. LLM-generated treatment options were recognized as AI-generated with a median (IQR) 7.5 (5.3-9.0) points in contrast to 2.0 (1.0-3.0) points for manually annotated cases. A crucial reason for identifying AI-generated treatment options was insufficient accompanying evidence. For each patient, at least 1 LLM generated a treatment option that was considered helpful by MTB members. Two unique useful treatment options (including 1 unique treatment strategy) were identified only by LLMs.
Conclusions and Relevance: In this diagnostic study, treatment options of LLMs in precision oncology did not reach the quality and credibility of human experts; however, they generated helpful ideas that might have complemented established procedures. Considering technological progress, LLMs could play an increasingly important role in assisting with screening and selecting relevant biomedical literature to support evidence-based, personalized treatment decisions.
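The precision, recall, and F1 figures reported in the abstract can be illustrated with a minimal Python sketch. It assumes that treatment options are compared as exact set members against the expert's list; the abstract does not state how the study matched options, so this matching rule is an assumption. The drug lists and the function names `precision_recall_f1` and `f1_from` are hypothetical and do not come from the study.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(predicted, reference):
    """Score predicted options against a reference (criterion standard) set.

    Assumption: options match only when the strings are identical.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall, f1_from(precision, recall)


# Hypothetical example data, not taken from the study.
expert = {"selpercatinib", "larotrectinib", "osimertinib", "olaparib"}
llm_a = {"selpercatinib", "pembrolizumab", "cisplatin", "docetaxel"}
llm_b = {"larotrectinib", "erlotinib"}

single = precision_recall_f1(llm_a, expert)
combined = precision_recall_f1(llm_a | llm_b, expert)  # union of both LLMs

print("LLM A alone:", single)
print("LLM A + B combined:", combined)
```

Because F1 is the harmonic mean of precision and recall, equal precision and recall give an F1 of the same value, which is consistent with the reported combination of 0.29, 0.29, and 0.29. In this toy data, pooling the two lists adds a second correct option, so recall rises even though more incorrect options are included.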
Pages: 11
Related papers
32 items in total
  • [1] Achiam OJ, 2023, Arxiv, DOI [arXiv:2303.08774, DOI 10.48550/ARXIV.2303.08774]
  • [2] Brown TB, 2020, ADV NEUR IN, V33
  • [3] A New Initiative on Precision Medicine
    Collins, Francis S.
    Varmus, Harold
    [J]. NEW ENGLAND JOURNAL OF MEDICINE, 2015, 372 (09) : 793 - 795
  • [4] Devlin J, 2019, Arxiv, DOI arXiv:1810.04805
  • [5] Efficacy of Selpercatinib in RET Fusion-Positive Non-Small-Cell Lung Cancer
    Drilon, A.
    Oxnard, G. R.
    Tan, D. S. W.
    Loong, H. H. F.
    Johnson, M.
    Gainor, J.
    McCoach, C. E.
    Gautschi, O.
    Besse, B.
    Cho, B. C.
    Peled, N.
    Weiss, J.
    Kim, Y. -J.
    Ohe, Y.
    Nishio, M.
    Park, K.
    Patel, J.
    Seto, T.
    Sakamoto, T.
    Rosen, E.
    Shah, M. H.
    Barlesi, F.
    Cassier, P. A.
    Bazhenova, L.
    De Braud, F.
    Garralda, E.
    Velcheti, V.
    Satouchi, M.
    Ohashi, K.
    Pennell, N. A.
    Reckamp, K. L.
    Dy, G. K.
    Wolf, J.
    Solomon, B.
    Falchook, G.
    Ebata, K.
    Nguyen, M.
    Nair, B.
    Zhu, E. Y.
    Yang, L.
    Huang, X.
    Olek, E.
    Rothenberg, S. M.
    Goto, K.
    Subbiah, V.
    [J]. NEW ENGLAND JOURNAL OF MEDICINE, 2020, 383 (09) : 813 - 824
  • [6] Efficacy of Larotrectinib in TRK Fusion-Positive Cancers in Adults and Children
    Drilon, A.
    Laetsch, T. W.
    Kummar, S.
    DuBois, S. G.
    Lassen, U. N.
    Demetri, G. D.
    Nathenson, M.
    Doebele, R. C.
    Farago, A. F.
    Pappo, A. S.
    Turpin, B.
    Dowlati, A.
    Brose, M. S.
    Mascarenhas, L.
    Federman, N.
    Berlin, J.
    El-Deiry, W. S.
    Baik, C.
    Deeken, J.
    Boni, V.
    Nagasubramanian, R.
    Taylor, M.
    Rudzinski, E. R.
    Meric-Bernstam, F.
    Sohal, D. P. S.
    Ma, P. C.
    Raez, L. E.
    Hechtman, J. F.
    Benayed, R.
    Ladanyi, M.
    Tuch, B. B.
    Ebata, K.
    Cruickshank, S.
    Ku, N. C.
    Cox, M. C.
    Hawkins, D. S.
    Hong, D. S.
    Hyman, D. M.
    [J]. NEW ENGLAND JOURNAL OF MEDICINE, 2018, 378 (08) : 731 - 739
  • [7] github, LLMs in PO GitHub page
  • [8] Google, Bard homepage
  • [9] ChatGPT in glioma adjuvant therapy decision making: ready to assume the role of a doctor in the tumour board?
    Haemmerli, Julien
    Sveikata, Lukas
    Nouri, Aria
    May, Adrien
    Egervari, Kristof
    Freyschlag, Christian
    Lobrinus, Johannes A.
    Migliorini, Denis
    Momjian, Shahan
    Sanda, Nicolae
    Schaller, Karl
    Tran, Sebastien
    Yeung, Jacky
    Bijlenga, Philippe
    [J]. BMJ HEALTH & CARE INFORMATICS, 2023, 30 (01)
  • [10] AI-Generated Medical Advice-GPT and Beyond
    Haupt, Claudia E.
    Marks, Mason
    [J]. JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2023, 329 (16) : 1349 - 1350