An audio-visual corpus for speech perception and automatic speech recognition (L)

Times cited: 754
Authors
Cooke, Martin
Barker, Jon
Cunningham, Stuart
Shao, Xu
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England
[2] Univ Sheffield, Dept Human Commun Sci, Sheffield S1 4DP, S Yorkshire, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
DOI
10.1121/1.2229005
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
An audio-visual corpus has been collected to support the use of common material in speech perception and automatic speech recognition studies. The corpus consists of high-quality audio and video recordings of 1000 sentences spoken by each of 34 talkers. Sentences are simple, syntactically identical phrases such as "place green at B 4 now." Intelligibility tests using the audio signals suggest that the material is easily identifiable in quiet and in low levels of stationary noise. The annotated corpus is available on the web for research use. © 2006 Acoustical Society of America.
Pages: 2421-2424
Number of pages: 4