Automatic Speech-to-Speech Translation of Educational Videos Using SeamlessM4T and Its Use for Future VR Applications

Cited by: 0

Authors:

Stefanel Gris, Lucas Rafael ^{[1]}

Fernandes, Diogo ^{[1]}

de Oliveira, Frederico Santos ^{[2]}

Soares, Anderson ^{[1]}

de Lima Soares, Telma Woerle ^{[1]}

Galvao, Arlindo ^{[1]}

Affiliations:

[1] Univ Fed Goias, Goiania, Go, Brazil

[2] Univ Fed Mato Grosso, Campo Grande, MS, Brazil

Source:

2024 IEEE CONFERENCE ON VIRTUAL REALITY AND 3D USER INTERFACES ABSTRACTS AND WORKSHOPS, VRW 2024 | 2024

Keywords:

Speech-to-Speech Translation; Speech Translation; Low-resource languages;

DOI:

10.1109/VRW62533.2024.00033

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Automatic Speech-to-Speech Translation (S2ST) is crucial for VR, providing immersive experiences and global accessibility. Cascade pipelines are often used for this task, but they face challenges in low-resource languages due to data scarcity, complexity, and maintenance. End-to-end models, though promising, are still in early development. This study explores the latest SeamlessM4T model, an end-to-end S2ST architecture showing great potential for VR applications, and discusses its strengths and limitations in the context of educational VR for low-resource languages.

Pages: 163-166

Page count: 4