ChatGPT and Bard Performance on the POSCOMP Exam

Cited: 0
Authors
Saldanha, Mateus Santos [1 ]
Digiampietri, Luciano Antonio [1 ]
Institutions
[1] Univ Sao Paulo, Sao Paulo, SP, Brazil
Source
PROCEEDINGS OF THE 20TH BRAZILIAN SYMPOSIUM ON INFORMATION SYSTEMS, SBSI 2024 | 2024
Keywords
Large Language Model; ChatBot; Computer Science Examination; ChatGPT; Bard;
DOI
10.1145/3658271.3658320
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Context: Modern chatbots, built upon advanced language models, have achieved remarkable proficiency in answering questions across diverse fields. Problem: Understanding the capabilities and limitations of these chatbots is a significant challenge, particularly as they are integrated into different information systems, including those in education. Solution: In this study, we conducted a quantitative assessment of the ability of two prominent chatbots, ChatGPT and Bard, to solve POSCOMP questions. IS Theory: The IS theory used in this work is Information Processing Theory. Method: We used a total of 271 questions from the last five POSCOMP exams that did not rely on graphic content as our materials. We presented these questions to the two chatbots in two formats: directly as they appeared in the exam and with additional context. In the latter case, the chatbots were informed that they were answering a multiple-choice question from a computing exam. Summary of Results: On average, the chatbots outperformed human exam-takers by more than 20%. Interestingly, both chatbots performed better, on average, without the additional context added to the prompt. They exhibited similar performance levels, with a slight advantage observed for ChatGPT. Contributions and Impact in the IS area: The primary contribution to the field involves the exploration of the capabilities and limitations of chatbots in addressing computing-related questions. This information is valuable for individuals developing Information Systems with the assistance of such chatbots or those relying on technologies built upon these capabilities.
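The two prompt formats described in the abstract can be sketched as follows. This is a minimal illustration only: the example question, option labels, and exact context wording are hypothetical and not taken from the paper.

```python
# Sketch of the two prompt variants from the study's method:
# (1) the question exactly as it appears in the exam, and
# (2) the same question preceded by explanatory context telling the
#     chatbot it is answering a multiple-choice computing-exam question.

CONTEXT = (
    "You are answering a multiple-choice question from a computing "
    "exam (POSCOMP). Reply with the letter of the correct alternative."
)

def build_prompts(question: str) -> dict:
    """Return both prompt variants for a single exam question."""
    return {
        "direct": question,                          # question as-is
        "with_context": f"{CONTEXT}\n\n{question}",  # context prepended
    }

# Illustrative question (not an actual POSCOMP item):
question = (
    "What is the worst-case time complexity of binary search?\n"
    "(A) O(n)  (B) O(log n)  (C) O(n log n)  (D) O(1)  (E) O(n^2)"
)
prompts = build_prompts(question)
```

Each variant would then be submitted to each chatbot and the chosen alternative compared against the official answer key.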
Pages: 10
Related Papers (50 total)
  • [31] Performance of ChatGPT and Bard on the medical licensing examinations varies across different cultures: a comparison study
    Chen, Yikai
    Huang, Xiujie
    Yang, Fangjie
    Lin, Haiming
    Lin, Haoyu
    Zheng, Zhuoqun
    Liang, Qifeng
    Zhang, Jinhai
    Li, Xinxin
    BMC MEDICAL EDUCATION, 2024, 24 (01)
  • [32] Quantitative evaluation of ChatGPT versus Bard responses to anaesthesia-related queries
    Patnaik, Sourav S.
    Hoffmann, Ulrike
    BRITISH JOURNAL OF ANAESTHESIA, 2024, 132 (01) : 169 - 171
  • [33] A review on enhancing education with AI: exploring the potential of ChatGPT, Bard, and generative AI
    Fenta, Anduamlak Abebe
    DISCOVER EDUCATION, 4 (1)
  • [34] Building the ArabNER Corpus for Arabic Named Entity Recognition Using ChatGPT and Bard
    Mahdhaoui, Hassen
    Mars, Abdelkarim
    Zrigui, Mounir
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, PT I, ACIIDS 2024, 2024, 14795 : 159 - 170
  • [35] AI-Powered Renal Diet Support: Performance of ChatGPT, Bard AI, and Bing Chat
    Qarajeh, Ahmad
    Tangpanithandee, Supawit
    Thongprayoon, Charat
    Suppadungsuk, Supawadee
    Krisanapan, Pajaree
    Aiumtrakul, Noppawit
    Valencia, Oscar A. Garcia
    Miao, Jing
    Qureshi, Fawad
    Cheungpasitporn, Wisit
    CLINICS AND PRACTICE, 2023, 13 (05) : 1160 - 1172
  • [36] ChatGPT and Bard exhibit spontaneous citation fabrication during psychiatry literature search
    McGowan, Alessia
    Gui, Yunlai
    Dobbs, Matthew
    Shuster, Sophia
    Cotter, Matthew
    Selloni, Alexandria
    Goodman, Marianne
    Srivastava, Agrima
    Cecchi, Guillermo A.
    Corcoran, Cheryl M.
    PSYCHIATRY RESEARCH, 2023, 326
  • [37] News Verifiers Showdown: A Comparative Performance Evaluation of ChatGPT 3.5, ChatGPT 4.0, Bing AI, and Bard in News Fact-Checking
    Caramancion, Kevin Matthe
    2023 IEEE FUTURE NETWORKS WORLD FORUM, FNWF, 2024
  • [38] Assessing ChatGPT's orthopedic in-service training exam performance and applicability in the field
    Jain, Neil
    Gottlich, Caleb
    Fisher, John
    Campano, Dominic
    Winston, Travis
    JOURNAL OF ORTHOPAEDIC SURGERY AND RESEARCH, 2024, 19 (01)
  • [40] Chatbots Put to the Test in Math and Logic Problems: A Comparison and Assessment of ChatGPT-3.5, ChatGPT-4, and Google Bard
    Plevris, Vagelis
    Papazafeiropoulos, George
    Rios, Alejandro Jimenez
    AI, 2023, 4 (04) : 949 - 969