HIFI-AV: An Audio-visual Corpus for Spoken Language Human-Machine Dialogue Research in Spanish

Cited by: 0

Authors:

Fernandez-Martinez, Fernando [1]

Manuel Lucas-Cuesta, Juan [1]

Barra Chicote, Roberto [1]

Ferreiros, Javier [1]

Macias-Guarasa, Javier [2]

Affiliations:

[1] Univ Politecn Madrid, ETSI Telecomunicac, E-28040 Madrid, Spain

[2] Univ Alcala, Escuela Politecn Super, Alcala De Henares 28871, Madrid, Spain

Source:

LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2010

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

H [Language, Writing];

Discipline classification code:

05;

Abstract:

In this paper, we describe a new multi-purpose audio-visual database in the context of speech interfaces for controlling household electronic devices. The database comprises speech and video recordings of 19 speakers interacting with a HIFI audio box by means of a spoken dialogue system. Dialogue management is based on Bayesian Networks, and the system is provided with contextual information handling strategies. Each speaker was asked to fulfil different sets of specific goals following predefined scenarios, which varied both in complexity level and in the degree of freedom or initiative allowed to the user. Owing to its careful design and its size, the recorded database allows comprehensive studies on speech recognition, speech understanding, dialogue modeling and management, microphone array based speech processing, and both speech and video-based acoustic source localisation. The database has been labelled for quality and efficiency studies on dialogue performance. The whole database has been validated through both objective and subjective tests.


Pages: 2974-2980

Page count: 7

