ChatGPT and Bard Performance on the POSCOMP Exam

Cited by: 0

Authors:

Saldanha, Mateus Santos [1]

Digiampietri, Luciano Antonio [1]

Affiliations:

[1] Univ Sao Paulo, Sao Paulo, SP, Brazil

Source:

PROCEEDINGS OF THE 20TH BRAZILIAN SYMPOSIUM ON INFORMATION SYSTEMS, SBSI 2024 | 2024

Keywords:

Large Language Model; ChatBot; Computer Science Examination; ChatGPT; Bard

DOI:

10.1145/3658271.3658320

Chinese Library Classification:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Context: Modern chatbots, built upon advanced language models, have achieved remarkable proficiency in answering questions across diverse fields. Problem: Understanding the capabilities and limitations of these chatbots is a significant challenge, particularly as they are integrated into different information systems, including those in education. Solution: In this study, we conducted a quantitative assessment of the ability of two prominent chatbots, ChatGPT and Bard, to solve POSCOMP questions. IS Theory: The IS theory used in this work is information processing theory. Method: Our materials were a total of 271 questions from the last five POSCOMP exams that did not rely on graphic content. We presented these questions to the two chatbots in two formats: directly as they appeared in the exam, and with additional context. In the latter case, the chatbots were informed that they were answering a multiple-choice question from a computing exam. Summary of Results: On average, the chatbots outperformed human exam-takers by more than 20%. Interestingly, both chatbots performed better on average when no additional context was added to the prompt. They exhibited similar performance levels, with a slight advantage observed for ChatGPT. Contributions and Impact in the IS area: The primary contribution to the field is the exploration of the capabilities and limitations of chatbots in addressing computing-related questions. This information is valuable for individuals developing information systems with the assistance of such chatbots, or relying on technologies built upon these capabilities.

Pages: 10
