Automatic Speech-to-Speech Translation of Educational Videos Using SeamlessM4T and Its Use for Future VR Applications

Authors
Stefanel Gris, Lucas Rafael [1]
Fernandes, Diogo [1]
de Oliveira, Frederico Santos [2]
Soares, Anderson [1]
de Lima Soares, Telma Woerle [1]
Galvao, Arlindo [1]
Affiliations
[1] Univ Fed Goias, Goiania, GO, Brazil
[2] Univ Fed Mato Grosso, Campo Grande, MS, Brazil
Source
2024 IEEE CONFERENCE ON VIRTUAL REALITY AND 3D USER INTERFACES ABSTRACTS AND WORKSHOPS, VRW 2024 | 2024
Keywords
Speech-to-Speech Translation; Speech Translation; Low-resource languages
DOI
10.1109/VRW62533.2024.00033
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatic Speech-to-Speech Translation (S2ST) is crucial for VR, providing immersive experiences and global accessibility. Cascade pipelines are often used for this task, but they face challenges in low-resource languages due to data scarcity, complexity, and maintenance. End-to-end models, though promising, are still in early development. This study explores the latest SeamlessM4T model, an end-to-end S2ST architecture showing great potential for VR applications, and discusses its strengths and limitations in the context of educational VR for low-resource languages.
Pages: 163-166
Page count: 4