Comparative Analysis of Models for Neural Machine Speech-to-Text Translation for Turkic State Languages

Cited by: 0
Authors
Nurmaganbet, Dauren [1]
Tukeyev, Ualsher [1]
Shormakova, Assem [1]
Zhumanov, Zhandos [1]
Affiliations
[1] Al Farabi Kazakh Natl Univ, Alma Ata, Kazakhstan
Source
INTELLIGENT INFORMATION AND DATABASE SYSTEMS, PT II, ACIIDS 2024 | 2024 / Vol. 14796
Keywords
Comparative analysis; Speech-to-text; Translation; Turkic state languages
DOI
10.1007/978-981-97-4985-0_28
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we compare and evaluate speech recognition models for the Turkic state languages: Azerbaijani, Kazakh, Kyrgyz, Turkish, Turkmen, and Uzbek. To this end, we conduct experimental studies of neural speech recognition with three openly available models: Whisper, an ASR system by OpenAI; TurkicASR by ISSAI; and the Massively Multilingual Speech (MMS) project by Facebook AI. The project is a key step towards streamlining the recording and processing of meeting minutes in the various Turkic languages. The scientific contribution of this article is a comparative analysis of speech recognition models for the Turkic state languages, and a selection among them, based on ongoing experimental studies.
Pages: 360-371
Page count: 12