Automatically enriching spoken corpora with syntactic information for linguistic studies

Cited by: 0
Authors
Nasr, Alexis [1]
Bechet, Frederic [1]
Favre, Benoit [1]
Bazillon, Thierry [1]
Deulofeu, Jose [1]
Valli, Andre [1]
Affiliations
[1] Aix Marseille Univ, CNRS, LIF UMR 7279, Marseille, France
Source
LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2014
Keywords
speech processing; parsing; syntactic annotation
DOI
Not available
Chinese Library Classification (CLC) number
H0 [Linguistics]
Subject classification codes
030303; 0501; 050102
Abstract
Syntactic parsing of speech transcriptions is hampered by disfluencies, which break the syntactic structure of utterances. This paper proposes two solutions to this problem. The first relies on a disfluency predictor that detects disfluencies and removes them prior to parsing. The second integrates the disfluencies into the syntactic structure of the utterances and trains a disfluency-aware parser.
Pages: 854-858
Page count: 5