Some experiments in audio-visual speech processing

Authors
Chollet, G. [1]
Landais, R. [1]
Hueber, T. [1, 2]
Bredin, H. [1]
Mokbel, C. [4]
Perrot, P. [1, 3]
Zouari, L. [1]
Affiliations
[1] CNRS, TSI Paris, LTCI, 46 Rue Barrault, F-75634 Paris 13, France
[2] ESPCI, Elect Lab, F-75005 Paris, France
[3] Inst Rec Criminelle Gendarmeri Natl IRCGN, F-93110 Paris, France
[4] Univ Balamand, Tripoli, Lebanon
Source
ADVANCES IN NONLINEAR SPEECH PROCESSING | 2007 / Vol. 4885
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Natural speech is produced by the vocal organs of a particular talker. The acoustic features of the speech signal must therefore be correlated with the movements of the articulators (lips, jaw, tongue, velum, etc.). For instance, hearing-impaired people (and not only them) improve their understanding of speech by lip reading. This chapter is an overview of audio-visual speech processing, with emphasis on some experiments concerning recognition, speaker verification, indexing, and corpus-based synthesis from tongue and lip movements.
Pages: 28 / +
Page count: 4