Some experiments in audio-visual speech processing


Authors:

Chollet, G. [1]

Landais, R. [1]

Hueber, T. [1,2]

Bredin, H. [1]

Mokbel, C. [4]

Perrot, P. [1,3]

Zouari, L. [1]

Affiliations:

[1] CNRS, TSI Paris, LTCI, 46 Rue Barrault, F-75634 Paris 13, France

[2] ESPCI, Elect Lab, F-75005 Paris, France

[3] Inst Rec Criminelle Gendarmeri Natl IRCGN, F-93110 Paris, France

[4] Univ Balamand, Tripoli, Lebanon

Source:

ADVANCES IN NONLINEAR SPEECH PROCESSING | 2007 / Vol. 4885

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Natural speech is produced by the vocal organs of a particular talker. The acoustic features of the speech signal must therefore be correlated with the movements of the articulators (lips, jaw, tongue, velum, ...). For instance, hearing-impaired people (and not only them) improve their understanding of speech by lip reading. This chapter is an overview of audiovisual speech processing, with emphasis on some experiments concerning recognition, speaker verification, indexing, and corpus-based synthesis from tongue and lip movements.


Pages: 28 / +

Page count: 4
