50 entries in total
[12] Audio-Visual Action Recognition Using Transformer Fusion Network [J]. APPLIED SCIENCES-BASEL, 2024, 14 (03)
[13] MPEG-7 audio-visual indexing test-bed for video retrieval [J]. INTERNET IMAGING V, 2004, 5304: 319-329
[15] Efficient Audio-Visual Speech Enhancement Using Deep U-Net With Early Fusion of Audio and Video Information and RNN Attention Blocks [J]. IEEE ACCESS, 2021, 9: 137584-137598
[16] Continuous Phoneme Recognition based on Audio-Visual Modality Fusion [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
[17] Robot Command Interface Using an Audio-Visual Speech Recognition System [J]. PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, PROCEEDINGS, 2009, 5856: 869-+
[18] TOWARDS GENERATING AMBISONICS USING AUDIO-VISUAL CUE FOR VIRTUAL REALITY [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 2012-2016
[19] AUDIO-VISUAL SPEECH INPAINTING WITH DEEP LEARNING [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 6653-6657
[20] An asynchronous DBN for audio-visual speech recognition [J]. 2006 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, 2006: 154-+