Root Cause Analysis of Miscommunication Hotspots in Spoken Dialogue Systems

Cited by: 3

Authors:

Georgiladakis, Spiros [1]

Athanasopoulou, Georgia [1]

Meena, Raveesh [2]

Lopes, Jose [2]

Chorianopoulou, Arodami [3]

Palogiannidi, Elisavet [3]

Iosif, Elias [1,4]

Skantze, Gabriel [2]

Potamianos, Alexandros [1,4]

Affiliations:

[1] Natl Tech Univ Athens, Sch Elect & Comp Engn, Athens, Greece

[2] KTH Speech Mus & Hearing, Stockholm, Sweden

[3] Tech Univ Crete, Sch Elect & Comp Engn, Khania, Greece

[4] Athena Res & Innovat Ctr, Athens, Greece

Source:

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016

Keywords:

miscommunication detection; miscommunication root causes; spoken dialogue systems

DOI:

10.21437/Interspeech.2016-1273

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

A major challenge in Spoken Dialogue Systems (SDS) is the detection of problematic communication (hotspots), as well as the classification of these hotspots into different types (root cause analysis). In this work, we focus on two classes of root cause, namely, erroneous speech recognition vs. other (e.g., dialogue strategy). Specifically, we propose an automatic algorithm for detecting hotspots and classifying root causes in two subsequent steps. Regarding hotspot detection, various lexico-semantic features are used for capturing repetition patterns along with affective features. Lexico-semantic and repetition features are also employed for root cause analysis. Both algorithms are evaluated with respect to the Let's Go dataset (bus information system). In terms of classification unweighted average recall, performance of 80% and 70% is achieved for hotspot detection and root cause analysis, respectively.
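The abstract reports results as unweighted average recall (UAR), which is the mean of the per-class recalls, so a rare class such as "hotspot" counts as much as a frequent one. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name and the toy labels are made up for illustration and are not taken from the Let's Go dataset.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    total = defaultdict(int)    # number of reference items per class
    correct = defaultdict(int)  # correctly predicted items per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels: 2 hotspot turns, 8 unproblematic turns.
y_true = ["hotspot"] * 2 + ["ok"] * 8
y_pred = ["hotspot", "ok"] + ["ok"] * 8

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.2f}, accuracy = {accuracy:.2f}")
```

In this toy case plain accuracy looks high because the majority class dominates, while UAR is pulled down by the missed hotspot. This is presumably why a class-balanced metric suits hotspot detection, where problematic turns are likely the minority.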


Pages: 1156-1160

Page count: 5
