Techniques for Rapid and Robust Topic Identification of Conversational Telephone Speech

Cited by: 0
Authors
Wintrode, Jonathan [1 ]
Kulp, Scott [1 ]
Affiliations
[1] Rutgers State Univ, US Dept Def, Piscataway, NJ 08855 USA
Source
INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 | 2009
Keywords
topic identification; speech recognition; error trade-offs; TF-IDF
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we investigate the impact of automatic speech recognition (ASR) errors on the accuracy of topic identification in conversational telephone speech. We present a modified TF-IDF feature weighting calculation that provides significant robustness under various recognition error conditions. For our experiments, we take conversations from the Fisher corpus and produce 1-best and lattice outputs using a single recognizer tuned to run at various speeds. We use an SVM classifier to perform topic identification on the output. We observe that classifiers incorporating confidence information are significantly more robust to errors than those treating the output as unweighted text.
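The abstract does not give the modified TF-IDF formula, so the following is only a minimal sketch of one common way to fold recognizer confidence into TF-IDF: each word contributes its confidence (a posterior in [0, 1]) instead of a hard count of 1. The function names, the soft document-frequency approximation, and the smoothed IDF are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import defaultdict


def expected_counts(hypothesis):
    """Sum per-word confidences to get soft term counts for one conversation.

    `hypothesis` is a list of (word, confidence) pairs; this input format is
    an assumption, standing in for 1-best or lattice output with posteriors.
    """
    counts = defaultdict(float)
    for word, confidence in hypothesis:
        counts[word] += confidence
    return dict(counts)


def confidence_tfidf(conversations):
    """Return one sparse TF-IDF vector (dict) per conversation.

    Illustrative weighting only: soft term frequency times a smoothed IDF
    built from soft document frequencies.
    """
    soft = [expected_counts(c) for c in conversations]
    n_docs = len(soft)

    # Soft document frequency: each conversation contributes at most 1 per word.
    doc_freq = defaultdict(float)
    for counts in soft:
        for word, count in counts.items():
            doc_freq[word] += min(1.0, count)

    vectors = []
    for counts in soft:
        total = sum(counts.values()) or 1.0  # avoid division by zero
        vectors.append({
            word: (count / total) * math.log((1 + n_docs) / (1 + doc_freq[word]))
            for word, count in counts.items()
        })
    return vectors


# Hypothetical toy input: two conversations with made-up confidences.
example = [
    [("sports", 0.9), ("game", 0.8), ("the", 1.0)],
    [("music", 0.7), ("the", 1.0), ("game", 0.2)],
]
example_vectors = confidence_tfidf(example)
```

Under these assumptions, a low-confidence word contributes little to both term frequency and document frequency, which is the intuition behind treating recognizer output as weighted rather than unweighted text. The resulting sparse vectors could then be passed to any SVM implementation.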
Pages: 1515-1518
Page count: 4