ON THE PHONETIC INFORMATION IN ULTRASONIC MICROPHONE SIGNALS

Cited by: 8

Authors:

Livescu, Karen [1]

Zhu, Bo [2]

Glass, James [2]

Affiliations:

[1] Toyota Technol Inst, Chicago, IL 60637 USA

[2] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA

Source:

2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS | 2009

Keywords:

Speech recognition; ultrasonic; multimodal; RECOGNITION;

DOI:

10.1109/ICASSP.2009.4960660

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

We study the phonetic information in the signal from an ultrasonic "microphone", a device that emits an ultrasonic wave toward a speaker and receives the reflected, Doppler-shifted signal. This can be used in addition to audio to improve automatic speech recognition. This work is an effort to better understand the ultrasonic signal, and potentially to determine a set of natural sub-word units. We present classification and clustering experiments on CVC and VCV sequences in speaker-dependent and multi-speaker settings. Using a set of ultrasonic spectral features and diagonal Gaussian models, it is possible to distinguish all consonants and most vowels. When clustering the confusion data, the consonant clusters mostly correspond to places and manners of articulation; the vowel data roughly clusters into high, low, and rounded vowels.

Citation:

Pages: 4621 / +

Page count: 3

10 references in total

[1] DETWEILER C, 2008, ULTRASONIC SPEECH CA
[2] Glass, JR. A probabilistic framework for segment-based speech recognition [J]. COMPUTER SPEECH AND LANGUAGE, 2003, 17 (2-3): 137-152
[3] JENNINGS DL, 1995, ICASSP
[4] Jiang, Jintao; Auer, Edward T.; Alwan, Abeer; Keating, Patricia A.; Bernstein, Lynne E. Similarity structure in visual speech perception and optical phonetic signals [J]. PERCEPTION & PSYCHOPHYSICS, 2007, 69 (7): 1070-1083
[5] Kalgaonkar K., 2008, ICASSP
[6] Kalgaonkar, Kaustubh; Hu, Rongqiang; Raj, Bhiksha. Ultrasonic Doppler sensor for voice activity detection [J]. IEEE SIGNAL PROCESSING LETTERS, 2007, 14 (10): 754-757
[7] Potamianos, G; Neti, C; Gravier, G; Garg, A; Senior, AW. Recent advances in the automatic recognition of audiovisual speech [J]. PROCEEDINGS OF THE IEEE, 2003, 91 (9): 1306-1326
[8] Walden, BE; Prosek, RA; Montgomery, AA; Scherr, CK; Jones, CJ. Effects of training on visual recognition of consonants [J]. JOURNAL OF SPEECH AND HEARING RESEARCH, 1977, 20 (1): 130-145
[9] XUE J, 2005, AUD VIS SPEECH PROC
[10] Zhu B., 2007, INTERSPEECH
