Automatic Stop List Generation for Clustering Recognition Results of Call Center Recordings
Cited by: 0
Authors:
Popova, Svetlana [1,3]
Krivosheeva, Tatiana [2]
Korenevsky, Maxim [2]
Affiliations:
[1] St Petersburg State Univ, St Petersburg 199034, Russia
[2] STC innovat Ltd, Petersburg, VA USA
[3] Scrol, St Petersburg, Russia
Source:
SPEECH AND COMPUTER
2014, Vol. 8773
Keywords:
clustering;
stop words;
stop list generation;
ASR;
DOI:
Not available
Chinese Library Classification:
TP18 [Artificial Intelligence Theory];
Discipline classification codes:
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
The paper deals with the problem of automatic stop list generation for processing recognition results of call center recordings, in particular for the purpose of clustering. We propose and test a supervised, domain-dependent method of automatic stop list generation. The method is based on finding words whose removal increases the dissimilarity between documents in different clusters and decreases the dissimilarity between documents within the same cluster. This approach is shown to be efficient for clustering recognition results of recordings of varying quality, both on datasets that contain the same topics as the training dataset and on datasets containing other topics.
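
The selection criterion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the dissimilarity measure, the search strategy, or the scoring function, so the cosine dissimilarity over raw term counts, the greedy one-pass search, and all names below (`cosine_dissimilarity`, `separation`, `generate_stop_list`, the toy documents) are assumptions chosen only for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_dissimilarity(a, b):
    """1 - cosine similarity of two term-count vectors (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty document is treated as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def separation(docs, labels, removed):
    """Mean between-cluster dissimilarity minus mean within-cluster
    dissimilarity, computed after dropping the words in `removed`."""
    vecs = [Counter(w for w in doc if w not in removed) for doc in docs]
    within, between = [], []
    for i, j in combinations(range(len(docs)), 2):
        d = cosine_dissimilarity(vecs[i], vecs[j])
        (within if labels[i] == labels[j] else between).append(d)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(between) - mean(within)


def generate_stop_list(docs, labels):
    """Greedy pass over the vocabulary (assumed strategy): keep a word in
    the stop list only if removing it improves the separation score."""
    vocab = sorted({w for doc in docs for w in doc})
    stop = set()
    best = separation(docs, labels, stop)
    for word in vocab:
        score = separation(docs, labels, stop | {word})
        if score > best:
            stop.add(word)
            best = score
    return stop


# Hypothetical toy data: tokenized transcripts with known cluster labels.
example_docs = [
    ["hello", "bill", "payment"],
    ["hello", "bill", "invoice"],
    ["hello", "router", "internet"],
    ["hello", "router", "reboot"],
]
example_labels = ["billing", "billing", "tech", "tech"]

print(sorted(generate_stop_list(example_docs, example_labels)))
```

On this toy data the greeting "hello", which occurs in every document, is selected as a stop word, while the topic-bearing words "bill" and "router" are kept. With so few documents, words that occur only once can also pass the test, so the sketch says nothing about how the method behaves on real recognition results.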