Incremental Dependency Parsing and Disfluency Detection in Spoken Learner English

Cited by: 4
Authors
Moore, Russell [1]
Caines, Andrew [1]
Graham, Calbert [1]
Buttery, Paula [1]
Affiliations
[1] Univ Cambridge, Dept Theoret & Appl Linguist, Automated Language Teaching & Assessment Inst, Cambridge, England
Source
TEXT, SPEECH, AND DIALOGUE (TSD 2015) | 2015 / Vol. 9302
Keywords
Spoken language; Learner English; Learner proficiency; Disfluency detection; Dependency parsing
DOI
10.1007/978-3-319-24033-6_53
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates the suitability of state-of-the-art natural language processing (NLP) tools for parsing the spoken language of second language learners of English. The task of parsing spoken learner-language is important to the domains of automated language assessment (ALA) and computer-assisted language learning (CALL). Due to the non-canonical nature of spoken language (containing filled pauses, nonstandard grammatical variations, hesitations and other disfluencies) and compounded by a lack of available training data, spoken language parsing has been a challenge for standard NLP tools. Recently the Redshift parser (Honnibal et al. In: Proceedings of CoNLL (2013)) has been shown to be successful in identifying grammatical relations and certain disfluencies in native speaker spoken language, returning unlabelled dependency accuracy of 90.5% and a disfluency F-measure of 84.1% (Honnibal & Johnson: TACL 2, 131-142 (2014)). We investigate how this parser handles spoken data from learners of English at various proficiency levels. Firstly, we find that Redshift's parsing accuracy on non-native speech data is comparable to Honnibal & Johnson's results, with 91.1% of dependency relations correctly identified. However, disfluency detection is markedly down, with an F-measure of just 47.8%. We attempt to explain why this should be, and investigate the effect of proficiency level on parsing accuracy. We relate our findings to the use of NLP technology for CALL and ALA applications.
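The two figures reported in the abstract, unlabelled dependency accuracy and disfluency F-measure, can be illustrated with a minimal sketch. This is not the paper's or Redshift's evaluation code: the function names, the per-token head indices, and the per-token boolean disfluency flags are assumptions made for illustration, using the standard definitions of attachment accuracy and F1.

```python
from typing import Sequence


def unlabelled_attachment_score(gold_heads: Sequence[int],
                                pred_heads: Sequence[int]) -> float:
    """Fraction of tokens whose predicted head index matches the gold head.

    Hypothetical helper: each sequence holds one head index per token.
    """
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted sequences must have equal length")
    if not gold_heads:
        return 0.0
    correct = sum(1 for g, p in zip(gold_heads, pred_heads) if g == p)
    return correct / len(gold_heads)


def disfluency_f_measure(gold_flags: Sequence[bool],
                         pred_flags: Sequence[bool]) -> float:
    """F1 over tokens marked as disfluent (harmonic mean of precision and recall).

    Hypothetical helper: each sequence holds one boolean flag per token.
    """
    if len(gold_flags) != len(pred_flags):
        raise ValueError("gold and predicted sequences must have equal length")
    pairs = list(zip(gold_flags, pred_flags))
    true_pos = sum(1 for g, p in pairs if g and p)
    false_pos = sum(1 for g, p in pairs if p and not g)
    false_neg = sum(1 for g, p in pairs if g and not p)
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
toy_uas = unlabelled_attachment_score([2, 0, 2, 3], [2, 0, 2, 2])
toy_f1 = disfluency_f_measure([True, True, False, False, True],
                              [True, False, False, True, True])
```

On the toy data, three of four heads match, and the disfluency flags give two true positives, one false positive and one false negative.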
Pages: 470-479
Number of pages: 10
Related papers
32 in total
  • [1] [Anonymous], 2003, P 41 ANN M ASS COMP
  • [2] [Anonymous], P 10 C COMP NAT LANG
  • [3] [Anonymous], 2014, T ASSOC COMPUT LING
  • [4] [Anonymous], 2001, Lexical-functional Syntax
  • [5] [Anonymous], 2004, P ACL
  • [6] Ballesteros M, 2013, COMPUTATIONAL LINGUI, V39
  • [7] Biber Douglas, 1995, Dimensions of register variation: A cross-linguistic comparison
  • [8] Brazil D., 1995, A GRAMMAR OF SPEECH
  • [9] Briscoe Ted, 2006, P COLING ACL 2006 IN
  • [10] Caines A., 2014, P 1 JOINT WORKSH STA