Synthesizing Near Native-accented Speech for a Non-native Speaker by Imitating the Pronunciation and Prosody of a Native Speaker

Cited by: 1
Authors
Chung, Raymond [1, 2]
Mak, Brian [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Clear Water Bay, Hong Kong, Peoples R China
[2] Logist & Supply Chain MultiTech R&D Ctr, Pok Fu Lam, Hong Kong, Peoples R China
Keywords
text-to-speech; neural speech synthesis; accent conversion; foreign accent
DOI
10.21437/Interspeech.2022-11124
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper investigates how to reduce foreign accent when synthesizing native (L1) speech for a non-native (L2) speaker. We focus on two major aspects of foreign accent: mispronunciation and improper prosody (rhythm, phoneme duration, and pauses). First, to reduce mispronunciations, the mel-spectrograms generated by an L2 text-to-speech (TTS) model are fed to a pre-trained speech recognizer, and the mispronunciation information is fed back to the TTS model during back-propagation to help the model learn correct native mel-spectrograms. Second, to imitate L1 speech prosody, a recent data augmentation (DA) technique originally proposed for speaking style transfer is applied to transfer the L1 speaking style to L2 speakers. The DA technique creates additional L2 speech that simulates L2 speakers trying to imitate L1 speech. Automatic speech recognition on the native-accented speech synthesized for non-native speakers by the proposed method gives a lower word error rate. The speaker embeddings produced by a pre-trained speaker verifier from the original L2 speakers' speech and from their synthesized speech are highly similar. Finally, subjective mean opinion scores (MOS) on the synthesized speech show that it has good quality and reduced accentedness.
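The two objective measures mentioned in the abstract, word error rate and speaker-embedding similarity, can be illustrated with a small sketch. This is not the authors' evaluation code. The function names are hypothetical, WER is computed here as word-level edit distance divided by reference length, and cosine similarity is an assumed choice of similarity measure (the abstract does not say which one was used).

```python
import math


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance (sub/ins/del) divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed prefix of ref
    # and hyp[:j]; it starts as the cost of inserting j words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def cosine_similarity(a, b) -> float:
    """Cosine similarity between two embedding vectors (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy transcript pair: one substitution and one deletion over 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    # Toy 3-dimensional "embeddings"; real speaker embeddings are much larger.
    print(cosine_similarity([0.2, 0.5, 0.1], [0.2, 0.4, 0.1]))
```

In such a setup, a lower WER on synthesized speech would suggest fewer recognizer-visible mispronunciations, and a cosine similarity close to 1 between embeddings of original and synthesized speech would suggest that speaker identity is preserved.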
Pages: 4302-4306
Page count: 5