Effects of Dialectal Code-Switching on Speech Modules: A Study using Egyptian Arabic Broadcast Speech

Cited by: 6
Authors
Chowdhury, Shammur A. [1]
Samih, Younes [1]
Eldesouki, Mohamed [2]
Ali, Ahmed [1]
Affiliations
[1] HBKU, Qatar Comp Res Inst, Doha, Qatar
[2] Concordia Univ, Montreal, PQ, Canada
Source
INTERSPEECH 2020 | 2020
Keywords
code-switching; dialect identification; corpus; code mixing index
DOI
10.21437/Interspeech.2020-2271
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Subject classification codes
100104; 100213
Abstract
Intra-utterance code-switching (CS) is defined as the alternation between two or more languages within the same utterance. Although spoken dialectal code-switching (DCS) is more challenging than CS, it remains largely unexplored. In this study, we describe a method to build the first spoken DCS corpus. The corpus is annotated at the token level, taking into account both linguistic and acoustic cues for dialectal Arabic. For detailed analysis, we study Arabic automatic speech recognition (ASR), Arabic dialect identification (ADI), and natural language processing (NLP) modules on the DCS corpus. Our results highlight the importance of lexical information for discriminating the DCS labels. We observe that the performance of the different models depends strongly on the degree of code-mixing at the token level as well as on its complexity at the utterance level.
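As a rough illustration of what "degree of code-mixing" can mean at the utterance level, the sketch below computes a commonly used formulation of the code mixing index (CMI), 100 * (1 - max_w / (n - u)), where n is the number of tokens, u the number of language-neutral tokens, and max_w the token count of the dominant language. This record does not state which CMI variant the paper uses, so the formula choice is an assumption, and the function name and the label names `MSA`, `EGY`, and `other` are hypothetical.

```python
from collections import Counter


def code_mixing_index(labels, neutral="other"):
    """Utterance-level CMI: 100 * (1 - max_w / (n - u)).

    labels  -- one language/dialect tag per token (tag names are hypothetical)
    neutral -- tag for language-independent tokens (assumed convention)
    """
    n = len(labels)
    # Count only tokens that carry a language tag.
    counts = Counter(tag for tag in labels if tag != neutral)
    u = n - sum(counts.values())
    if n == u:
        # No language-tagged tokens (including the empty utterance).
        return 0.0
    return 100.0 * (1 - max(counts.values()) / (n - u))


if __name__ == "__main__":
    utterance = ["MSA", "MSA", "EGY", "MSA", "other"]
    print(code_mixing_index(utterance))
```

Under this formulation a monolingual utterance scores 0, and the score grows as tokens are spread more evenly across languages or dialects.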
Pages: 2382-2386
Page count: 5
Related papers
50 in total
  • [21] Open Domain Continuous Filipino Speech Recognition with Code-Switching
    Ang, Federico
    Miyanaga, Yoshikazu
    Guevara, Rowena Cristina
    Cajote, Rhandley
    Bayona, Michael Gringo Angelo
    2014 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2014, : 2301 - 2304
  • [22] Language-specific Characteristic Assistance for Code-switching Speech Recognition
    Song, Tongtong
    Xu, Qiang
    Ge, Meng
    Wang, Longbiao
    Shi, Hao
    Lv, Yongjie
    Lin, Yuqin
    Dang, Jianwu
    INTERSPEECH 2022, 2022, : 3924 - 3928
  • [23] DATA AUGMENTATION FOR END-TO-END CODE-SWITCHING SPEECH RECOGNITION
    Du, Chenpeng
    Li, Hao
    Lu, Yizhou
    Wang, Lan
    Qian, Yanmin
    2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 194 - 200
  • [24] Pronunciation augmentation for Mandarin-English code-switching speech recognition
    Long, Yanhua
    Wei, Shuang
    Lian, Jie
    Li, Yijie
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (01)
  • [25] Acoustic and Textual Data Augmentation for Improved ASR of Code-Switching Speech
    Yilmaz, Emre
    van den Heuvel, Henk
    van Leeuwen, David A.
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1933 - 1937
  • [26] Code-Switching in Speech of Tundra Yukaghir: Bi- and Multilingual Repetition
    Kurilova, Samona N.
    NAUCHNYI DIALOG, 2023, 12 (05): : 72 - 92
  • [27] UnitDiff: A Unit-Diffusion Model for Code-Switching Speech Synthesis
    Chen, Ke
    Huang, Zhihua
    He, Liang
    Yan, Yonghong
    IEEE SIGNAL PROCESSING LETTERS, 2025, 32 : 1051 - 1055
  • [28] End-to-End Language Diarization for Bilingual Code-Switching Speech
    Liu, Hexin
    Perera, Leibny Paola Garcia
    Zhang, Xinyi
    Dauwels, Justin
    Khong, Andy W. H.
    Khudanpur, Sanjeev
    Styles, Suzy J.
    INTERSPEECH 2021, 2021, : 1489 - 1493
  • [29] Multi-Encoder-Decoder Transformer for Code-Switching Speech Recognition
    Zhou, Xinyuan
    Yilmaz, Emre
    Long, Yanhua
    Li, Yijie
    Li, Haizhou
    INTERSPEECH 2020, 2020, : 1042 - 1046