AISPEECH-SJTU ASR SYSTEM FOR THE ACCENTED ENGLISH SPEECH RECOGNITION CHALLENGE

Cited by: 9
Authors
Tan, Tian [1, 2]
Lu, Yizhou [2]
Ma, Rao [2]
Zhu, Sen [1]
Guo, Jiaqi [1]
Qian, Yanmin [2]
Affiliations
[1] AISpeech Ltd, Suzhou, Peoples R China
[2] Shanghai Jiao Tong Univ, AI Inst, Dept Comp Sci & Engn, SpeechLab, MoE Key Lab Artificial Intelligence, Shanghai, Peoples R China
Keywords
accent speech recognition; accent adaptation; data augmentation; RNNLM
DOI
10.1109/ICASSP39728.2021.9414471
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper describes the AISpeech-SJTU ASR system for the Interspeech-2020 Accented English Speech Recognition Challenge (AESRC). The task is challenging because accented speech varies in pronunciation accuracy, intonation, speed, and the pronunciation of some syllables. All participants were restricted to developing their systems using only the speech and text corpora provided by the organizer. To work around the data-scarcity problem, data augmentation was explored first, including noise simulation, SpecAugment, speed perturbation, and TTS simulation. A state-of-the-art CNN-Transformer-based joint CTC-attention system was then built, and accent adaptation was proposed to train an accent-robust system. Finally, the first-pass recognition hypotheses generated by the CTC head were rescored by forward and backward LSTM language models and by the attention head. With its best configuration, our system took second place in the challenge, with a word error rate (WER) of 4.00% on the dev set and 4.47% on the test set. The test-set WERs of the top-performing, second runner-up, and official baseline systems are 4.06%, 4.52%, and 8.29%, respectively.
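The rescoring step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each first-pass hypothesis carries four log-probabilities (CTC, attention head, forward LSTM-LM, backward LSTM-LM) and that these are combined by a weighted sum. The abstract does not give the combination rule or the weights, so `Hypothesis`, `WEIGHTS`, `combined_score`, `rescore`, and all scores below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    ctc: float        # first-pass CTC log-probability
    attention: float  # attention-head log-probability
    fwd_lm: float     # forward LSTM-LM log-probability
    bwd_lm: float     # backward LSTM-LM log-probability


# Hypothetical interpolation weights; the abstract does not report the values used.
WEIGHTS = {"ctc": 0.3, "attention": 0.7, "fwd_lm": 0.2, "bwd_lm": 0.2}


def combined_score(hyp, weights=WEIGHTS):
    """Weighted sum of the four log-probabilities (an assumed combination rule)."""
    return (
        weights["ctc"] * hyp.ctc
        + weights["attention"] * hyp.attention
        + weights["fwd_lm"] * hyp.fwd_lm
        + weights["bwd_lm"] * hyp.bwd_lm
    )


def rescore(nbest, weights=WEIGHTS):
    """Return the N-best list sorted from best to worst combined score."""
    return sorted(nbest, key=lambda h: combined_score(h, weights), reverse=True)


# Toy N-best list with made-up scores: the first entry wins on CTC score alone,
# but the language models and the attention head prefer the second.
nbest = [
    Hypothesis("i want to by a ticket", ctc=-4.0, attention=-6.5, fwd_lm=-14.0, bwd_lm=-13.5),
    Hypothesis("i want to buy a ticket", ctc=-4.2, attention=-5.0, fwd_lm=-9.0, bwd_lm=-9.5),
]

best = rescore(nbest)[0]
print(best.text)
```

In this toy example the second-pass scores overturn the first-pass ranking, which is the effect rescoring is meant to have. In practice the weights would be tuned on a development set.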
Pages: 6413-6417
Number of pages: 5