Effects of Dialectal Code-Switching on Speech Modules: A Study using Egyptian Arabic Broadcast Speech

Citations: 6
Authors
Chowdhury, Shammur A. [1 ]
Samih, Younes [1 ]
Eldesouki, Mohamed [2 ]
Ali, Ahmed [1 ]
Affiliations
[1] HBKU, Qatar Comp Res Inst, Doha, Qatar
[2] Concordia Univ, Montreal, PQ, Canada
Source
INTERSPEECH 2020 | 2020
Keywords
code-switching; dialect identification; corpus; code mixing index;
DOI
10.21437/Interspeech.2020-2271
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification codes
100104; 100213;
Abstract
Intra-utterance code-switching (CS) is defined as the alternation between two or more languages within the same utterance. Although spoken dialectal code-switching (DCS) is more challenging than CS, it remains largely unexplored. In this study, we describe a method to build the first spoken DCS corpus. The corpus is annotated at the token level, taking into account both linguistic and acoustic cues for dialectal Arabic. For detailed analysis, we study Arabic automatic speech recognition (ASR), Arabic dialect identification (ADI), and natural language processing (NLP) modules for the DCS corpus. Our results highlight the importance of lexical information for discriminating the DCS labels. We observe that the performance of different models is highly dependent on the degree of code-mixing at the token level as well as its complexity at the utterance level.
Pages: 2382-2386
Number of pages: 5
Related papers
50 items in total
  • [41] ADDRESSING ACCENT MISMATCH IN MANDARIN-ENGLISH CODE-SWITCHING SPEECH RECOGNITION
    Tan, Zhili
    Fan, Xinghua
    Zhu, Hui
    Lin, Ed
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8259 - 8263
  • [42] Improving Transformer Based End-to-End Code-Switching Speech Recognition Using Language Identification
    Huang, Zheying
    Wang, Pei
    Wang, Jian
    Miao, Haoran
    Xu, Ji
    Zhang, Pengyuan
    APPLIED SCIENCES-BASEL, 2021, 11 (19):
  • [43] Study of Semi-supervised Approaches to Improving English-Mandarin Code-Switching Speech Recognition
    Guo, Pengcheng
    Xu, Haihua
    Xie, Lei
    Chng, Eng Siong
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1928 - 1932
  • [44] Collection and Analysis of Code-switch Egyptian Arabic-English Speech Corpus
    Hamed, Injy
    Elmahdy, Mohamed
    Abdennadher, Slim
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 3805 - 3809
  • [45] Code-switching strategies in republic-age literary Latin: 'polyphony' and 'reported speech'
    Poccetti, Paolo
    STUDI E SAGGI LINGUISTICI, 2015, 53 (02): : 129 - 162
  • [46] On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition
    Zeng, Zhiping
    Khassanov, Yerbolat
    Van Tung Pham
    Xu, Haihua
    Chng, Eng Siong
    Li, Haizhou
    INTERSPEECH 2019, 2019, : 2165 - 2169
  • [47] EXPLORING RETRAINING-FREE SPEECH RECOGNITION FOR INTRA-SENTENTIAL CODE-SWITCHING
    Huang, Zhen
    Zhuang, Xiaodan
    Liu, Daben
    Xiao, Xiaoqiang
    Zhang, Yuchen
    Siniscalchi, Sabato Marco
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6066 - 6070
  • [48] Albanian Verbs in Slavic Bilingual Speech: Between Code-Switching and Adaptation of Lexical Borrowings
    Morozova, M. S.
    Rusakov, A. Yu
    VESTNIK TOMSKOGO GOSUDARSTVENNOGO UNIVERSITETA FILOLOGIYA-TOMSK STATE UNIVERSITY JOURNAL OF PHILOLOGY, 2021, 74 : 113 - 129
  • [49] IITG-HingCoS corpus: A Hinglish code-switching database for automatic speech recognition
    Ganji, Sreeram
    Dhawan, Kunal
    Sinha, Rohit
    SPEECH COMMUNICATION, 2019, 110 : 76 - 89
  • [50] Investigating Bilingual Deep Neural Networks for Automatic Recognition of Code-switching Frisian Speech
    Yilmaz, Emre
    van den Heuvel, Henk
    van Leeuwen, David
    SLTU-2016 5TH WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGIES FOR UNDER-RESOURCED LANGUAGES, 2016, 81 : 159 - 166