ChatGPT and Bard Performance on the POSCOMP Exam

Cited by: 0
Authors
Saldanha, Mateus Santos [1 ]
Digiampietri, Luciano Antonio [1 ]
Affiliations
[1] Univ Sao Paulo, Sao Paulo, SP, Brazil
Source
PROCEEDINGS OF THE 20TH BRAZILIAN SYMPOSIUM ON INFORMATION SYSTEMS, SBSI 2024 | 2024
Keywords
Large Language Model; ChatBot; Computer Science Examination; ChatGPT; Bard;
DOI
10.1145/3658271.3658320
Chinese Library Classification
TP [Automation technology; computer technology];
Subject classification code
0812;
Abstract
Context: Modern chatbots, built upon advanced language models, have achieved remarkable proficiency in answering questions across diverse fields. Problem: Understanding the capabilities and limitations of these chatbots is a significant challenge, particularly as they are integrated into different information systems, including those in education. Solution: In this study, we conducted a quantitative assessment of the ability of two prominent chatbots, ChatGPT and Bard, to solve POSCOMP questions. IS Theory: The IS theory used in this work is Information Processing Theory. Method: Our materials comprised 271 questions from the last five POSCOMP exams that did not rely on graphical content. We presented these questions to the two chatbots in two formats: directly, as they appeared in the exam, and with additional context. In the latter case, the chatbots were informed that they were answering a multiple-choice question from a computing exam. Summary of Results: On average, the chatbots outperformed human exam-takers by more than 20%. Interestingly, both chatbots performed better, on average, without the additional context added to the prompt. They exhibited similar performance levels, with a slight advantage observed for ChatGPT. Contributions and Impact in the IS area: The primary contribution to the field is the exploration of the capabilities and limitations of chatbots in addressing computing-related questions. This information is valuable for individuals developing Information Systems with the assistance of such chatbots or relying on technologies built upon these capabilities.
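The evaluation protocol described in the abstract (each text-only question submitted once verbatim and once with an added computing-exam context, then scored against the official answer key) can be illustrated with a minimal sketch. The sketch below is an assumption-laden illustration, not the authors' code: ask_chatbot and extract_choice are hypothetical placeholders for the chatbot API call and for parsing the selected alternative, and the context wording is only an approximation of what the paper describes.

# Minimal sketch of the two prompting formats described in the abstract.
# ask_chatbot() and extract_choice() are hypothetical placeholders, not the
# paper's implementation or any specific chatbot API.

from typing import Callable, Dict, List

# Approximation of the "additional context" condition described in the record.
CONTEXT_PREFIX = (
    "You are answering a multiple-choice question from a computing exam. "
    "Reply with the letter of the single correct alternative.\n\n"
)

def build_prompts(question: str) -> Dict[str, str]:
    """Return the two prompt variants: the raw question and the question
    preceded by additional context."""
    return {
        "direct": question,
        "with_context": CONTEXT_PREFIX + question,
    }

def evaluate(
    questions: List[Dict[str, str]],       # each item: {"text": ..., "answer": ...}
    ask_chatbot: Callable[[str], str],     # hypothetical: prompt -> chatbot reply
    extract_choice: Callable[[str], str],  # hypothetical: reply -> "A".."E"
) -> Dict[str, float]:
    """Compute accuracy for each prompt format over the question set."""
    correct = {"direct": 0, "with_context": 0}
    for q in questions:
        for fmt, prompt in build_prompts(q["text"]).items():
            if extract_choice(ask_chatbot(prompt)) == q["answer"]:
                correct[fmt] += 1
    total = max(len(questions), 1)  # guard against an empty question set
    return {fmt: hits / total for fmt, hits in correct.items()}

Under this reading, comparing the returned accuracies for "direct" and "with_context" against the human exam-takers' average reproduces the kind of comparison summarized in the results.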
Pages: 10