Predicting Course Performance on a Massive Open Online Course Platform: A Natural Language Processing Approach

Cited by: 0
Authors
Alphenaar, Grant [1 ]
Ibn Rafiq, Rahat [1 ]
Affiliations
[1] Grand Valley State University, Allendale, MI 49401, USA
Source
INFORMATION MANAGEMENT AND BIG DATA, SIMBIG 2023 | 2024, Vol. 2142
Keywords
Natural Language Processing; MOOC; Education Data Mining; Data Analysis
DOI
10.1007/978-3-031-63616-5_15
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Massive open online courses (MOOCs) and platforms such as Udemy have proliferated in recent years. These courses run the gamut from highly successful, highly rated offerings to courses with very low ratings and little engagement. This research addresses the challenge of preemptively identifying potentially low-rated courses by leveraging instructor-provided textual information. Our approach involves a two-stage process. First, we employ transformer-based Large Language Models (LLMs) to extract semantic information from the text provided by instructors on the Udemy platform. In the second stage, we incorporate the extracted information as additional features in an upstream predictive model. To the best of our knowledge, this is the first attempt to use semantic information extracted from MOOC courses as features in a predictive model. In general, we find that existing consumer research findings hold, and we identify three key takeaways. First, an instructor's prior performance is a strong indicator of future ratings. Second, including the semantic information contained in instructor-provided text can have an additive effect on model performance. Finally, fine-tuning language models on Udemy-specific text has an appreciable positive effect on upstream model performance.
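
The following is a minimal sketch of the two-stage pipeline described in the abstract, not the authors' implementation: the encoder checkpoint (all-MiniLM-L6-v2), the toy course table and its column names, and the XGBoost hyperparameters are all illustrative assumptions. Per the abstract, fine-tuning the encoder on Udemy-specific text before Stage 1 would be expected to further improve the upstream model.

# Sketch of the two-stage approach: (1) extract semantic features from
# instructor-provided text with a pretrained transformer encoder,
# (2) feed them, alongside tabular features such as the instructor's
# prior performance, to an upstream gradient-boosted predictor.
# All model names, columns, and hyperparameters below are assumptions.

import numpy as np
import pandas as pd
from sentence_transformers import SentenceTransformer
from xgboost import XGBRegressor

# Toy stand-in for a scraped Udemy course table (hypothetical columns).
courses = pd.DataFrame({
    "description": [
        "Learn Python from scratch with hands-on projects.",
        "Advanced machine learning with real-world datasets.",
        "A quick overview of spreadsheets.",
    ],
    "instructor_avg_rating": [4.6, 4.3, 3.1],  # prior-performance feature
    "num_lectures": [120, 85, 10],
    "rating": [4.7, 4.4, 2.9],                 # prediction target
})

# Stage 1: encode instructor-provided text into semantic embeddings.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
text_features = encoder.encode(courses["description"].tolist())

# Stage 2: concatenate embeddings with tabular features and fit an
# upstream gradient-boosted regressor (cf. the paper's XGBoost reference).
tabular = courses[["instructor_avg_rating", "num_lectures"]].to_numpy()
X = np.hstack([tabular, text_features])
y = courses["rating"].to_numpy()

model = XGBRegressor(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X, y)
print(model.predict(X))  # predicted course ratings
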
Pages: 199-216 (18 pages)