HIFI-AV: An Audio-visual Corpus for Spoken Language Human-Machine Dialogue Research in Spanish

Cited by: 0
Authors
Fernandez-Martinez, Fernando [1]
Lucas-Cuesta, Juan Manuel [1]
Barra Chicote, Roberto [1]
Ferreiros, Javier [1]
Macias-Guarasa, Javier [2]
Affiliations
[1] Univ Politecn Madrid, ETSI Telecomunicac, E-28040 Madrid, Spain
[2] Univ Alcala, Escuela Politecn Super, Alcala De Henares 28871, Madrid, Spain
Source
LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2010
DOI
Not available
Chinese Library Classification (CLC) number
H [Language, Linguistics]
Subject classification code
05
Abstract
In this paper, we describe a new multi-purpose audio-visual database in the context of speech interfaces for controlling household electronic devices. The database comprises speech and video recordings of 19 speakers interacting with a HIFI audio box by means of a spoken dialogue system. Dialogue management is based on Bayesian Networks, and the system is provided with strategies for handling contextual information. Each speaker was asked to fulfil different sets of specific goals following predefined scenarios, which varied in both complexity level and the degree of freedom or initiative allowed to the user. Thanks to its careful design and its size, the recorded database allows comprehensive studies on speech recognition, speech understanding, dialogue modeling and management, microphone-array-based speech processing, and both speech- and video-based acoustic source localisation. The database has been labelled for quality and efficiency studies on dialogue performance. The whole database has been validated through both objective and subjective tests.
Pages: 2974-2980
Page count: 7
Related papers
3 in total
  • [1] Audio-visual Evaluation and Detection of Word Prominence in a Human-Machine Interaction Scenario
    Heckmann, Martin
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012: 2387-2390
  • [2] Steps Towards More Natural Human-Machine Interaction via Audio-Visual Word Prominence Detection
    Heckmann, Martin
    MULTIMODAL ANALYSES ENABLING ARTIFICIAL AGENTS IN HUMAN-MACHINE INTERACTION, 2015, 8757: 15-24
  • [3] ES-Port: a Spontaneous Spoken Human-Human Technical Support Corpus for Dialogue Research in Spanish
    Garcia-Sardina, Laura
    Serras, Manex
    del Pozo, Arantza
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018: 781-785