Improving Speaker Segmentation via Speaker Identification and Text Segmentation

Cited by: 0

Authors:

Li, Runxin ^{[1]}

Schultz, Tanja ^{[1]}

Jin, Qin ^{[1]}

Affiliations:

[1] Carnegie Mellon Univ, Language Technol Inst, InterACT, Pittsburgh, PA 15213 USA

Source:

INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 | 2009

Keywords:

speaker diarization; speaker segmentation; speaker identification; text segmentation

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Speaker segmentation is an essential part of a speaker diarization system. Common segmentation systems often miss speaker change points when speakers switch quickly. These errors seriously confuse the subsequent speaker clustering step and result in high overall speaker diarization error rates. In this paper, two methods are proposed to deal with this problem. The first approach uses speaker identification techniques to boost speaker segmentation, and the second applies text segmentation methods to improve speaker segmentation performance. Experiments on the Quaero speaker diarization evaluation data show that our methods achieve up to a 45% relative reduction in the speaker diarization error and a 64% relative increase in the speaker change detection recall rate over the baseline system. Moreover, both approaches can be considered post-processing steps over the baseline segmentation; therefore, they can be applied in any speaker diarization system.

Pages: 928-931

Page count: 4
