50 entries in total
[42]
Deep Audio-Visual Saliency: Baseline Model and Data
[J].
ETRA 2020 SHORT PAPERS: ACM SYMPOSIUM ON EYE TRACKING RESEARCH & APPLICATIONS,
2020,
[43]
Edged based Audio-Visual Speech enhancement demonstrator
[J].
INTERSPEECH 2024,
2024,
:2032-2033
[44]
Audio-Visual Multilevel Fusion for Speech and Speaker Recognition
[J].
INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5,
2008,
:379-382
[45]
An audio-visual corpus for multimodal automatic speech recognition
[J].
Journal of Intelligent Information Systems,
2017, 49
:167-192
[46]
A Compact Formulation of Turbo Audio-Visual Speech Recognition
[J].
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP),
2014,
[49]
Audio-Visual Speech Recognition Using A Two-Step Feature Fusion Strategy
[J].
2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR),
2021,
:1896-1903
[50]
Automatic Assessment of Chinese Dysarthria Using Audio-visual Vowel Graph Attention Network
[J].
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING,
2025, 33
:1454-1466