Enhancing Orthopedic Knowledge Assessments: The Performance of Specialized Generative Language Model Optimization

Cited by: 0
Authors
Zhou, Hong [1, 2]
Wang, Hong-lin [1, 2]
Duan, Yu-yu [2, 3]
Yan, Zi-neng [1, 2]
Luo, Rui [1, 2]
Lv, Xiang-xin [1, 2]
Xie, Yi [1, 2]
Zhang, Jia-yao [1, 2]
Yang, Jia-ming [1, 2]
Xue, Ming-di [1, 2]
Fang, Ying [1, 2]
Lu, Lin [2, 4]
Liu, Peng-ran [1, 2]
Ye, Zhe-wei [1, 2]
Affiliations
[1] Huazhong Univ Sci & Technol, Union Hosp, Tongji Med Coll, Dept Orthoped Surg, Wuhan 430022, Peoples R China
[2] Huazhong Univ Sci & Technol, Union Hosp, Tongji Med Coll, Lab Intelligent Med, Wuhan 430022, Peoples R China
[3] Hubei Univ Chinese Med, Coll Chinese Med, Wuhan 433065, Peoples R China
[4] Wuhan Univ, Dept Orthoped, Renmin Hosp, Wuhan 433060, Peoples R China
Source
CURRENT MEDICAL SCIENCE | 2024
Funding
National Natural Science Foundation of China;
Keywords
artificial intelligence; large language models; generative artificial intelligence; orthopedics; CLINICAL-PRACTICE GUIDELINE; AMERICAN ACADEMY; HIP-FRACTURES; MANAGEMENT;
DOI
10.1007/s11596-024-2929-4
Chinese Library Classification number
R-3 [Medical research methods]; R3 [Basic medicine];
Discipline classification code
1001 ;
Abstract
Objective: This study aimed to evaluate and compare the effectiveness of knowledge base-optimized and unoptimized large language models (LLMs) in the field of orthopedics, in order to explore optimization strategies for applying LLMs in specific fields.
Methods: A specialized knowledge base was constructed from clinical guidelines of the American Academy of Orthopaedic Surgeons (AAOS) and authoritative orthopedic publications. A total of 30 orthopedic questions covering anatomical knowledge, disease diagnosis, fracture classification, treatment options, and surgical techniques were input into both the knowledge base-optimized and unoptimized versions of GPT-4, ChatGLM, and Spark LLM, and the generated responses were recorded. Three experienced orthopedic surgeons evaluated the overall quality, accuracy, and comprehensiveness of these responses.
Results: Compared with its unoptimized counterpart, the optimized version of GPT-4 showed improvements of 15.3% in overall quality, 12.5% in accuracy, and 12.8% in comprehensiveness; ChatGLM showed improvements of 24.8%, 16.1%, and 19.6%, respectively; and Spark LLM showed improvements of 6.5%, 14.5%, and 24.7%, respectively.
Conclusion: Knowledge base optimization significantly enhances the quality, accuracy, and comprehensiveness of the responses provided by the 3 models in the orthopedic field. Therefore, knowledge base optimization is an effective method for improving the performance of LLMs in specific fields.
Pages: 1001-1005
Page count: 5