Alector: A Parallel Corpus of Simplified French Texts with Alignments of Misreadings by Poor and Dyslexic Readers

Cited by: 0
Authors
Gala, Nuria [1 ]
Tack, Anais [2 ,3 ]
Javourey-Drevet, Ludivine [4 ,5 ]
Francois, Thomas [2 ]
Ziegler, Johannes C. [4 ]
Affiliations
[1] Aix Marseille Univ, Lab Parole & Langage, LPL CNRS, UMR 7309, Marseille, France
[2] UCLouvain, CENTAL, Louvain La Neuve, Belgium
[3] Katholieke Univ Leuven, Imec Res Grp, ITEC, Leuven, Belgium
[4] Aix Marseille Univ, Lab Psychol Cognit, LPC CNRS, UMR 7290, Marseille, France
[5] Aix Marseille Univ, Apprentissage Didact Evaluat Format EA 4671, Marseille, France
Source
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020) | 2020
Keywords
Parallel corpora; text simplification; readability; linguistic complexity; misreading; poor-readers; dyslexia; LEXICAL DATABASE;
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
In this paper, we present a new parallel corpus addressed to researchers, teachers, and speech therapists interested in text simplification as a means of alleviating difficulties in children learning to read. The corpus is composed of excerpts drawn from 79 authentic literary (tales, stories) and scientific (documentary) texts commonly used in French schools for children aged 7 to 9. The excerpts were manually simplified at the lexical, morpho-syntactic, and discourse levels in order to propose a parallel corpus for reading tests and for the development of automatic text simplification tools. A sample of 21 poor-reading and dyslexic children with an average reading delay of 2.5 years read a portion of the corpus. The transcripts of reading errors were integrated into the corpus with the goal of identifying lexical difficulty in the target population. By means of statistical testing, we provide evidence that the manual simplifications significantly reduced reading errors, highlighting that the words targeted for simplification were not only well-chosen but also substituted with substantially easier alternatives. The entire corpus can be consulted through a web interface and is available on demand for research purposes.
Pages: 1353-1361
Number of pages: 9
Related papers
32 in total
  • [1] [Anonymous], 2007, THESIS
  • [2] Billami M., 2018, P 27 INT C COMP LING, P2570
  • [3] Coster William, 2011, P 49 ANN M ASS COMP, P665
  • [4] François T., 2016, ACT C TRAIT AUT LANG, P15
  • [5] Gala N., 2018, Langue francaise, V199, P123
  • [6] Gala Nuria, 2016, P WORKSH COMP LING L, P59
  • [7] Relationship Between Apparent Elastic Modulus and Microstructural Parameters of Senile Vertebral Trabecular Bone
    Gong, He
    Zhu, Dong
    Xiao, Zhitao
    Zhang, Xizheng
    Zhang, Ming
    [J]. 2010 3RD INTERNATIONAL CONFERENCE ON BIOMEDICAL ENGINEERING AND INFORMATICS (BMEI 2010), VOLS 1-7, 2010: 1353-1356
  • [8] MANULEX: A grade-level lexical database from French elementary school readers
    Lété, B
    Sprenger-Charolles, L
    Colé, P
    [J]. BEHAVIOR RESEARCH METHODS INSTRUMENTS & COMPUTERS, 2004, 36 (01): 156-166
  • [9] Mullis I. V. S., 2017, Progress in International Reading Literacy Study
  • [10] Nandiegou M., 2018, THESIS