BiDKT: Deep Knowledge Tracing with BERT

Cited by: 8

Authors:

Tan, Weicong [1]

Jin, Yuan [1]

Liu, Ming [2]

Zhang, He [3]

Affiliations:

[1] Monash Univ, Melbourne, Vic, Australia

[2] Deakin Univ, Geelong, Vic, Australia

[3] ZHONGTUKEXIN CO LTD, Beijing 100020, Peoples R China

Source:

AD HOC NETWORKS AND TOOLS FOR IT, ADHOCNETS 2021 | 2022 / Vol. 428

Keywords:

Educational data mining; Knowledge tracing; BERT; SYSTEM;

DOI:

10.1007/978-3-030-98005-4_19

Chinese Library Classification (CLC):

TP301 [Theory and methods];

Discipline classification code:

081202;

Abstract:

Deep knowledge tracing is a family of deep learning models that predict whether students will answer future questions on different subjects correctly (as an indicator of whether they have mastered those subjects), based on their histories of interactions with the subjects. Early deep knowledge tracing models mostly rely on recurrent neural networks (RNNs), which can learn only from a unidirectional context in the response sequences during training. An alternative that learns from context in both directions is to use bidirectional deep learning models. The most recent significant advance in this regard is BERT, a transformer-style bidirectional model that has outperformed numerous RNN models on several NLP tasks. We therefore apply and adapt BERT to the deep knowledge tracing task and propose the model BiDKT. It is trained on a masked correctness recovery task, in which the model predicts the correctness of a small percentage of randomly masked responses from their bidirectional context in the sequences. We conduct experiments on several real-world knowledge tracing datasets and show that BiDKT can outperform some state-of-the-art approaches at predicting the correctness of future student responses on some of the datasets. We also discuss possible reasons why BiDKT underperforms in certain scenarios. Finally, we study the impact of several key components of BiDKT on its performance.
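
The masked correctness recovery task described in the abstract can be illustrated with a minimal sketch of the masking step alone. This is not the authors' implementation: the token ids (0 = incorrect, 1 = correct, 2 = masked), the 15% default mask rate, and the function name `mask_responses` are all assumptions made for illustration, since the abstract only states that "a small percentage" of responses is masked at random.

```python
import random

# Hypothetical token ids: 0 = incorrect, 1 = correct, 2 = masked response.
MASK = 2


def mask_responses(responses, mask_rate=0.15, seed=None):
    """Hide a random fraction of correctness labels in one student's sequence.

    Returns (masked, targets): `masked` is a copy of the sequence with MASK at
    the hidden positions, and `targets` maps each hidden position to its true
    label, which a bidirectional model would be trained to recover.
    """
    rng = random.Random(seed)
    # Mask at least one response when the sequence is non-empty (assumption).
    n_masked = min(len(responses), max(1, round(len(responses) * mask_rate)))
    positions = sorted(rng.sample(range(len(responses)), n_masked))
    masked = list(responses)
    targets = {}
    for pos in positions:
        targets[pos] = responses[pos]
        masked[pos] = MASK
    return masked, targets


responses = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
masked, targets = mask_responses(responses, mask_rate=0.2, seed=0)
```

Because the unmasked responses on both sides of each hidden position stay visible, a model trained on `masked` with `targets` as labels can use context from both directions, which is the property the abstract contrasts with unidirectional RNNs.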

Pages: 260-278

Page count: 19

