EVALUATING THE PERFORMANCE OF A LARGE LANGUAGE MODEL (LLM) COMPARED TO HUMANS IN A COMPLEX CATEGORIZATION TASK

Citations: 0
Authors
Edema, C. [1]
Martin, A. [1]
Martin, C. [1]
Bertuzzi, A. [1]
King, E. [1]
Wesson, F. [1]
Witkowski, M. [1]
Affiliations
[1] Crystallise, Stanford-le-Hope, Essex, England
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
F [Economics];
Discipline classification code
02;
Abstract
MSR193
Pages: 2
Related papers
50 in total
  • [41] LLM-BRC: A large language model-based bug report classification framework
    Du, Xiaoting
    Liu, Zhihao
    Li, Chenglong
    Ma, Xiangyue
    Li, Yingzhuo
    Wang, Xinyu
    SOFTWARE QUALITY JOURNAL, 2024, 32 (03) : 985 - 1005
  • [42] LLM-TIKG: Threat intelligence knowledge graph construction utilizing large language model
    Hu, Yuelin
    Zou, Futai
    Han, Jiajia
    Sun, Xin
    Wang, Yilei
    COMPUTERS & SECURITY, 2024, 145
  • [43] Advancing Human-Robot Interaction Using AI - A Large Language Model (LLM) Approach
    Dimitropoulos, Nikos
    Papalexis, Pantelis
    Michalos, George
    Makris, Sotiris
    ADVANCES IN ARTIFICIAL INTELLIGENCE IN MANUFACTURING, ESAIM 2023, 2024 : 116 - 125
  • [44] Investigating large language model (LLM) performance using in-context learning (ICL) for interpretation of ESMO and NCCN guidelines for lung cancer
    Iivanainen, Sanna
    Lagus, Jarkko
    Viertolahti, Henri
    Sippola, Lauri
    Koivunen, Jussi
    JOURNAL OF CLINICAL ONCOLOGY, 2024, 42 (16)
  • [45] Task Planning for a Factory Robot Using Large Language Model
    Tsushima, Yosuke
    Yamamoto, Shu
    Ravankar, Ankit A.
    Luces, Jose Victorio Salazar
    Hirata, Yasuhisa
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2025, 10 (03) : 2383 - 2390
  • [46] The performance of large language model-powered chatbots compared to oncology physicians on colorectal cancer queries
    Zhou, Shan
    Luo, Xiao
    Chen, Chan
    Jiang, Hong
    Yang, Chun
    Ran, Guanghui
    Yu, Juan
    Yin, Chengliang
    INTERNATIONAL JOURNAL OF SURGERY, 2024, 110 (10) : 6509 - 6517
  • [47] Evaluating Large Language Models for Automated Reporting and Data Systems Categorization: Cross-Sectional Study
    Wu, Qingxia
    Li, Huali
    Wang, Yan
    Bai, Yan
    Wu, Yaping
    Yu, Xuan
    Li, Xiaodong
    Dong, Pei
    Xue, Jon
    Shen, Dinggang
    Wang, Meiyun
    JMIR MEDICAL INFORMATICS, 2024, 12
  • [48] CompressionGPT: Evaluating Fault Tolerance of a Compressed Large Language Model
    Kapur, Neil
    Rangel, America
    Pentecost, Lillian
    2023 IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION, IISWC, 2023 : 232 - 234
  • [49] Evaluating the diagnostic performance of a large language model-powered chatbot for providing immunohistochemistry recommendations in dermatopathology
    McCrary, Myles R.
    Galambus, Justine
    Chen, Wei-Shen
    JOURNAL OF CUTANEOUS PATHOLOGY, 2024, 51 (09) : 689 - 695
  • [50] Performance of a Large Language Model in Screening Citations
    Oami, Takehiko
    Okada, Yohei
    Nakada, Taka-aki
    JAMA NETWORK OPEN, 2024, 7 (07) : e2420496