Text Simplification to Specific Readability Levels

Cited by: 5
Authors
Alkaldi, Wejdan [1]
Inkpen, Diana [2]
Affiliations
[1] King Saud Univ, Coll Comp & Informat Sci, Dept Informat Technol, Riyadh 11451, Saudi Arabia
[2] Univ Ottawa, Sch Elect Engn & Comp Sci, 800 King Edward, Ottawa, ON K1N 6N5, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
text simplification; deep learning; reinforcement learning; readability level; data augmentation;
DOI
10.3390/math11092063
Chinese Library Classification number
O1 [Mathematics];
Subject classification code
0701 ; 070101 ;
Abstract
The ability to read a document depends on the reader's skills and the text's readability level. In this paper, we propose a system that uses deep learning techniques to simplify texts to match a reader's level. Our novel approach uses a reinforcement learning loop that contains a readability classifier. The classifier's output determines whether more simplification is needed, and the loop continues until the desired readability level is reached. The simplification models are trained on data annotated with readability levels from the Newsela corpus, using a version of the corpus aligned at the sentence level. The models operate at the sentence level, simplifying each sentence to meet the specified readability level. We also produce an augmented dataset by automatically annotating more pairs of sentences using a readability-level classifier. Our text simplification models achieve better performance than state-of-the-art techniques for this task.
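The classifier-gated loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it shows only the control flow (simplify, classify, repeat until the target level is reached) and leaves out the reinforcement learning training entirely. The names `simplify_to_level`, `toy_simplify`, `toy_classify`, and `max_rounds` are hypothetical, the two toy functions are stand-ins for trained models, and the convention that a lower level number means easier text is an assumption.

```python
from typing import Callable


def simplify_to_level(
    sentence: str,
    target_level: int,
    simplify: Callable[[str], str],
    classify: Callable[[str], int],
    max_rounds: int = 5,
) -> str:
    """Simplify a sentence until the classifier reports the target level.

    Assumption: lower level numbers mean easier text.
    """
    current = sentence
    for _ in range(max_rounds):
        # The classifier output decides whether more simplification is needed.
        if classify(current) <= target_level:
            break
        candidate = simplify(current)
        # Stop if the simplifier can no longer change the sentence.
        if candidate == current:
            break
        current = candidate
    return current


# Hypothetical stand-ins for the trained models, for illustration only.
REPLACEMENTS = {
    "approximately": "about",
    "demonstrate": "show",
    "utilization": "use",
}


def toy_classify(text: str) -> int:
    # Toy readability level: the number of words longer than 8 characters.
    return sum(1 for word in text.split() if len(word) > 8)


def toy_simplify(text: str) -> str:
    # Replace the first word that has a known simpler substitute.
    words = text.split()
    for i, word in enumerate(words):
        if word in REPLACEMENTS:
            words[i] = REPLACEMENTS[word]
            break
    return " ".join(words)


if __name__ == "__main__":
    source = "Results demonstrate approximately double utilization"
    print(simplify_to_level(source, 1, toy_simplify, toy_classify))
```

The `max_rounds` cap and the no-change check are design choices of this sketch, added so the loop terminates even when the target level cannot be reached.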
Pages: 12