ON THE PHONETIC INFORMATION IN ULTRASONIC MICROPHONE SIGNALS

Cited by: 8
Authors
Livescu, Karen [1 ]
Zhu, Bo [2 ]
Glass, James [2 ]
Affiliations
[1] Toyota Technol Inst, Chicago, IL 60637 USA
[2] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
Source
2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS | 2009
Keywords
Speech recognition; ultrasonic; multimodal; RECOGNITION;
DOI
10.1109/ICASSP.2009.4960660
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We study the phonetic information in the signal from an ultrasonic "microphone", a device that emits an ultrasonic wave toward a speaker and receives the reflected, Doppler-shifted signal. This can be used in addition to audio to improve automatic speech recognition. This work is an effort to better understand the ultrasonic signal, and potentially to determine a set of natural sub-word units. We present classification and clustering experiments on CVC and VCV sequences in speaker-dependent and multi-speaker settings. Using a set of ultrasonic spectral features and diagonal Gaussian models, it is possible to distinguish all consonants and most vowels. When clustering the confusion data, the consonant clusters mostly correspond to places and manners of articulation; the vowel data roughly clusters into high, low, and rounded vowels.
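As a rough illustration of the classification setup the abstract mentions (diagonal Gaussian models, with confusion data collected for later clustering), the following is a minimal sketch. It runs on synthetic two-dimensional data rather than the paper's ultrasonic spectral features, and the names `DiagonalGaussianClassifier` and `confusion_matrix` are hypothetical, chosen only for this example; it is not the authors' implementation.

```python
import numpy as np


class DiagonalGaussianClassifier:
    """One Gaussian per class with a diagonal covariance (hypothetical helper)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # Per-class mean and per-dimension variance; a small floor avoids division by zero.
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.vars_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-6
        self.log_priors_ = np.log(np.array([(y == c).mean() for c in self.classes_]))
        return self

    def log_scores(self, X):
        # Log-likelihood under each class model plus log prior; shape (n_samples, n_classes).
        diff = X[:, None, :] - self.means_[None, :, :]
        per_dim = np.log(2.0 * np.pi * self.vars_)[None, :, :] + diff ** 2 / self.vars_[None, :, :]
        return -0.5 * per_dim.sum(axis=2) + self.log_priors_[None, :]

    def predict(self, X):
        return self.classes_[np.argmax(self.log_scores(X), axis=1)]


def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    counts = np.zeros((len(classes), len(classes)), dtype=int)
    for t, p in zip(y_true, y_pred):
        counts[index[t], index[p]] += 1
    return counts


# Synthetic stand-in data: three well-separated "consonant" classes in 2-D.
rng = np.random.default_rng(0)
labels = np.array(["b", "d", "g"])
centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
X = np.vstack([rng.normal(c, 1.0, size=(50, 2)) for c in centers])
y = np.repeat(labels, 50)

clf = DiagonalGaussianClassifier().fit(X, y)
pred = clf.predict(X)
cm = confusion_matrix(y, pred, clf.classes_)
accuracy = np.trace(cm) / cm.sum()
```

The off-diagonal entries of `cm` are the kind of confusion data that could then be passed to a clustering step; which clustering method the paper used is not stated in the abstract, so none is assumed here.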
Pages: 4621 / +
Page count: 3