Is ChatGPT a Competent Teacher? Systematic Evaluation of Large Language Models on the Competency Model

Cited by: 0

Authors:

Gong, Liuying [1]

Chen, Jingyuan [2]

Wu, Fei [3]

Affiliations:

[1] Zhejiang Univ, Sch Publ Affairs, Hangzhou 310058, Peoples R China

[2] Zhejiang Univ, Coll Educ, Hangzhou 310058, Peoples R China

[3] Zhejiang Univ, Coll Comp Sci, Hangzhou 310027, Peoples R China

Source:

IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES | 2025 / Vol. 18

Keywords:

Education; Ethics; Chatbots; Systematics; Standards; Security; Psychology; Large language models; Generative AI; Ciphers; Evaluation; large language models (LLMs); teacher competency; INTELLIGENCE; VALIDATION; OPTIMISM; LLM;

DOI:

10.1109/TLT.2025.3564177

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

The capabilities of large language models (LLMs) in language comprehension, conversational interaction, and content generation have led to their widespread adoption across various educational stages and contexts. Given the fundamental role of education, concerns are rising about whether LLMs can serve as competent teachers. To address the challenge of comprehensively evaluating the competencies of LLMs as teachers, a systematic quantitative evaluation based on the competency model has emerged as a valuable approach. Our study, grounded in the teacher competency model and drawing from 14 existing scales, constructed an evaluation framework called TeacherComp. Based on TeacherComp, we evaluated six LLMs from OpenAI across four dimensions: knowledge, skills, values, and traits. Through comparisons between LLMs' responses and human norms, we found that: 1) with each successive update, LLMs have shown overall improvements in knowledge, while their skills dimension scores have increasingly aligned with human norms; 2) there are both commonalities and differences in the performance of various LLMs regarding values and traits. For instance, while they all tend to exhibit more negative traits than humans, their morals can vary; and 3) LLMs with reduced security, constructed using jailbreak techniques, exhibit values and traits more closely aligned with human norms. Building on these findings, we provided interpretations and suggestions for the application of LLMs in various educational contexts. Overall, this study helps teachers and students use LLMs in appropriate contexts and provides developers with guidance for future iterations, thereby advancing the role of LLMs in empowering education.


Pages: 530-541

Page count: 12

