ChatGPT and Bard Performance on the POSCOMP Exam

Cited by: 0

Authors:

Saldanha, Mateus Santos [1]

Digiampietri, Luciano Antonio [1]

Affiliation:

[1] Univ Sao Paulo, Sao Paulo, SP, Brazil

Source:

PROCEEDINGS OF THE 20TH BRAZILIAN SYMPOSIUM ON INFORMATION SYSTEMS, SBSI 2024 | 2024

Keywords:

Large Language Model; ChatBot; Computer Science Examination; ChatGPT; Bard;

DOI:

10.1145/3658271.3658320

Chinese Library Classification (CLC):

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812 ;

Abstract:

Context: Modern chatbots, built upon advanced language models, have achieved remarkable proficiency in answering questions across diverse fields. Problem: Understanding the capabilities and limitations of these chatbots is a significant challenge, particularly as they are integrated into different information systems, including those in education. Solution: In this study, we conducted a quantitative assessment of the ability of two prominent chatbots, ChatGPT and Bard, to solve POSCOMP questions. IS Theory: The IS theory used in this work is information processing theory. Method: As our materials, we used a total of 271 questions from the last five POSCOMP exams that did not rely on graphic content. We presented these questions to the two chatbots in two formats: directly as they appeared in the exam, and with additional context. In the latter case, the chatbots were informed that they were answering a multiple-choice question from a computing exam. Summary of Results: On average, the chatbots outperformed human exam-takers by more than 20%. Interestingly, both chatbots performed better, on average, without additional context added to the prompt. They exhibited similar performance levels, with a slight advantage observed for ChatGPT. Contributions and Impact in the IS area: The primary contribution to the field is the exploration of the capabilities and limitations of chatbots in addressing computing-related questions. This information is valuable for individuals developing Information Systems with the assistance of such chatbots, or relying on technologies built upon these capabilities.


Pages: 10
