ADiEU: Toward Domain-Based Evaluation of Spoken Dialog Systems

Cited by: 0

Authors:

Kleindienst, Jan [1]

Curin, Jan [1]

Labsky, Martin [1]

Affiliations:

[1] IBM Res Corp, Prague, Czech Republic

Source:

HUMAN-COMPUTER INTERACTION, PT I | 2009 / Vol. 5610

Keywords:

Dialog; evaluation; scoring; multimodal; speech recognition

DOI:

Not available

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

We propose a new approach to evaluating spoken dialog systems. The novelty of our method lies in combining domain-specific knowledge with deterministic measurement of dialog system performance on a set of individual tasks within the domain. The proposed methodology thus attempts to answer questions such as: "How well does my dialog system perform on a specific domain?", "How much has my dialog system improved since the previous version?", and "How much better or worse is my dialog system than other dialog systems operating on that domain?"

Pages: 287-294

Page count: 8


