Student Mastery or AI Deception? Analyzing ChatGPT's Assessment Proficiency and Evaluating Detection Strategies

Cited by: 0

Authors:

Wang, Kevin [1]

Akins, Seth [1]

Mohammed, Abdallah [1]

Lawrence, Ramon [1]

Affiliation:

[1] Univ British Columbia, Dept Comp Sci, Kelowna, BC V1V 2Z3, Canada

Source:

2023 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE, CSCI 2023 | 2023

Keywords:

ChatGPT; generative AI; performance; detection; plagiarism; CS1; CS2; database

DOI:

10.1109/CSCI62032.2023.00268

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Generative AI systems such as ChatGPT have a disruptive effect on learning and assessment. Computer science requires practice to develop problem-solving and programming skills, which are traditionally developed through assignments. Generative AI can complete these assignments for students with high accuracy, which dramatically increases the potential for academic integrity issues and for students not achieving the desired learning outcomes. This work investigates the performance of ChatGPT by evaluating it across three courses (CS1, CS2, databases). ChatGPT completes almost all introductory assessments perfectly. Existing detection methods, such as MOSS and JPlag (based on similarity metrics) and GPTZero (AI detection), have mixed success in identifying AI solutions. An evaluation of instructors and teaching assistants using heuristics to distinguish between student and AI code shows that their detection is not sufficiently accurate. These observations emphasize the need for adapting assessments and improving detection methods.


Pages: 1615-1621

Page count: 7
