Automatic proficiency scoring for early-stage writing

Cited: 0
Authors
Andersen M.R. [1 ]
Kabel K. [2 ]
Bremholm J. [3 ]
Bundsgaard J. [2 ]
Hansen L.K. [1 ]
Affiliations
[1] DTU Compute, Technical University of Denmark, Richard Petersens Plads, Kongens Lyngby
[2] School of Education, Aarhus University, Tuborgvej 164, København NV
[3] National Centre for Reading, Danish University Colleges, Humletorvet 3, København V
Source
Computers and Education: Artificial Intelligence | 2023 / Vol. 5
Keywords
Automatic scoring; Danish; Early-stage literacy; Low-resource languages; Machine learning; Natural language processing; Rasch models; Writing proficiency
DOI
10.1016/j.caeai.2023.100168
Abstract
In this work, we study the feasibility of using machine learning and natural language processing methods for assessing writing proficiency in Danish with respect to text construction, sentence construction, and use of modifiers. Our work is based on the analytical framework for scoring early writing proposed by Kabel et al. (2022), where each text is first annotated by a human expert according to a predefined coding scheme and subsequently scored using statistical Rasch modeling (Rasch, 1960). We investigate two different strategies for estimating these scores automatically: 1) we propose a system that automatically identifies the central linguistic features, mimicking the role of the human experts, and 2) we train state-of-the-art discriminative machine learning models to predict the proficiency scores directly from the texts. We conduct a number of experiments to evaluate and compare the two approaches. Our results show strong and statistically significant correlations between the scores generated by the automatic system and scores based on human experts. We also estimate and report the reliability of the individual linguistic features in the automatic annotation system. Finally, we propose and evaluate an extension of the statistical model, which allows the model to compensate for potential systematic errors in the automatic annotations. The article thereby contributes to the area of automated essay scoring (AES) and shows that it is possible to provide teachers with valid and reliable automated knowledge about the development of their students' writing competences, which they can use in their feedback to students. © 2023 The Authors
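The pipeline sketched in the abstract rests on two generic building blocks: a Rasch model, which relates a writer's latent proficiency to the probability of exhibiting a given linguistic feature, and a rank correlation for comparing automatic scores with expert-based ones. The following is a minimal illustrative sketch of these two ingredients, not the authors' implementation; the function names and the tie-free Spearman variant are our own simplifications:

```python
import math

def rasch_prob(ability, difficulty):
    """Dichotomous Rasch model: probability that a writer with latent
    `ability` exhibits a linguistic feature with latent `difficulty`.
    At ability == difficulty the probability is exactly 0.5."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))

def spearman(xs, ys):
    """Spearman rank correlation between two score lists.
    Simplified sketch: assumes no tied values (no midrank correction)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, i in enumerate(order):
            r[i] = rank
        return r

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

In this spirit, the automatic annotation system produces per-feature binary codes, a Rasch-type model converts them to proficiency scores, and a correlation such as the one above quantifies agreement with expert-derived scores.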
References (48 in total)
[1]  
Alikaniotis D., Yannakoudakis H., Rei M., Automatic text scoring using neural networks, Proceedings of the 54th annual meeting of the association for computational linguistics (volume 1: Long papers), pp. 715-725, (2016)
[2]  
Baeza-Yates R.A., Ribeiro-Neto B.A., Modern information retrieval, (1999)
[3]  
Beigman Klebanov B., Madnani N., Automated evaluation of writing – 50 years and counting, Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, pp. 7796-7810, (2020)
[4]  
Boulanger D., Kumar V., Deep learning in automated essay scoring, International conference on intelligent tutoring systems, pp. 294-299, (2018)
[5]  
Bremholm J., Bundsgaard J., Kabel K., Proficiency scales for early writing development, Writing and Pedagogy, 13, pp. 121-154, (2022)
[6]  
Bundsgaard J., Kabel K., Bremholm J., Validating scales for the early development of writing proficiency, Writing and Pedagogy, 13, pp. 89-120, (2022)
[7]  
Crossley S.A., Linguistic features in writing quality and development: An overview, Journal of Writing Research, 11, pp. 415-443, (2020)
[8]  
Devlin J., Chang M.W., Lee K., Toutanova K., BERT: Pre-training of deep bidirectional transformers for language understanding, Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: Human language technologies, volume 1 (long and short papers), pp. 4171-4186, (2019)
[9]  
Diderichsen P., Elementær dansk grammatik [Elementary Danish grammar], (1987)
[10]  
Dozat T., Manning C.D., Deep biaffine attention for neural dependency parsing, 5th international conference on learning representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings, OpenReview.net, (2017)