Data-selective Transfer Learning for Multi-Domain Speech Recognition

Authors
Doulaty, Mortaza [1]
Saz, Oscar [1]
Hain, Thomas [1]
Affiliations
[1] Univ Sheffield, Speech & Hearing Grp, Sheffield, S Yorkshire, England
Source
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015
Keywords
data selection; transfer learning; negative transfer; speech recognition;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Negative transfer in the training of acoustic models for automatic speech recognition has been reported in several contexts, such as changes of domain or differences in speaker characteristics. This paper proposes a novel technique to overcome negative transfer through efficient selection of speech data for acoustic model training, where data is chosen by its relevance to a specific target. A submodular function based on likelihood ratios is used to determine how acoustically similar each training utterance is to a target test set. The approach is evaluated on a wide-domain data set covering speech from radio and TV broadcasts, telephone conversations, meetings, lectures and read speech. Experiments demonstrate that the proposed technique both finds relevant data and limits negative transfer. Results on a 6-hour test set show a relative improvement of 4% with data selection over using all data in PLP-based models, and 2% with DNN features.
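The selection idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the record does not give the exact submodular function, so everything below is an assumption made for illustration. A single one-dimensional Gaussian stands in for the acoustic models, the per-utterance score is a log-likelihood ratio between a target model and a background model fitted on all training data, and the objective is a generic submodular form (a square root of summed relevance per group). The names `llr_scores`, `greedy_select`, `groups` and `budget` are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian(frames):
    """Fit a 1-D Gaussian (mean, variance) to a list of scalar frames."""
    mean = sum(frames) / len(frames)
    var = sum((x - mean) ** 2 for x in frames) / len(frames)
    return mean, max(var, 1e-6)  # floor the variance to avoid division by zero


def avg_loglik(frames, mean, var):
    """Average per-frame log-likelihood under a 1-D Gaussian."""
    const = math.log(2.0 * math.pi * var)
    return sum(-0.5 * (const + (x - mean) ** 2 / var) for x in frames) / len(frames)


def llr_scores(train_utts, target_frames):
    """Log-likelihood ratio per training utterance: target model vs. background model.

    Assumption: the background model is fitted on all pooled training frames.
    """
    pooled = [x for frames in train_utts.values() for x in frames]
    t_mean, t_var = fit_gaussian(target_frames)
    b_mean, b_var = fit_gaussian(pooled)
    return {
        uid: avg_loglik(f, t_mean, t_var) - avg_loglik(f, b_mean, b_var)
        for uid, f in train_utts.items()
    }


def greedy_select(scores, groups, budget):
    """Greedily maximise f(S) = sum over groups of sqrt(sum of relevance in S).

    Relevance is the LLR clipped at zero, so f is monotone and submodular.
    Selection stops at `budget` utterances or when no utterance adds value.
    """
    weights = {u: max(s, 0.0) for u, s in scores.items()}
    mass = defaultdict(float)  # accumulated relevance per group
    remaining = set(weights)
    selected = []

    def gain(u):
        g = groups[u]
        return math.sqrt(mass[g] + weights[u]) - math.sqrt(mass[g])

    while remaining and len(selected) < budget:
        best = max(sorted(remaining), key=gain)  # sorted() makes ties deterministic
        if gain(best) <= 0.0:
            break  # everything left scores at or below the background model
        selected.append(best)
        mass[groups[best]] += weights[best]
        remaining.remove(best)
    return selected
```

In this sketch, utterances that the background model explains better than the target model receive zero relevance and are never selected, which mirrors the stated goal of limiting negative transfer. The square root gives diminishing returns within a group, so the selection does not collapse onto one source domain.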
Pages: 2897-2901
Number of pages: 5
Related papers
  • [1] Multi-domain spoken language understanding with transfer learning
    Jeong, Minwoo
    Lee, Gary Geunbae
    SPEECH COMMUNICATION, 2009, 51 (05) : 412 - 424
  • [2] Multi-Domain and Multi-Task Learning for Human Action Recognition
    Liu, An-An
    Xu, Ning
    Nie, Wei-Zhi
    Su, Yu-Ting
    Zhang, Yong-Dong
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (02) : 853 - 867
  • [3] A Multi-Domain Feature Learning Method for Visual Place Recognition
    Yin, Peng
    Xu, Lingyun
    Li, Xueqian
    Yin, Chen
    Li, Yingli
    Srivatsan, Rangaprasad Arun
    Li, Lu
    Ji, Jianmin
    He, Yuqing
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 319 - 324
  • [4] Exploiting data diversity in multi-domain federated learning
    Madni, Hussain Ahmad
    Umer, Rao Muhammad
    Foresti, Gian Luca
    MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2024, 5 (02):
  • [5] Effective Data Augmentation with Multi-Domain Learning GANs
    Yamaguchi, Shin'ya
    Kanai, Sekitoshi
    Eda, Takeharu
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 6566 - 6574
  • [6] WENETSPEECH: A 10000+HOURS MULTI-DOMAIN MANDARIN CORPUS FOR SPEECH RECOGNITION
    Zhang, Binbin
    Lv, Hang
    Guo, Pengcheng
    Shao, Qijie
    Yang, Chao
    Xie, Lei
    Xu, Xin
    Bu, Hui
    Chen, Xiaoyu
    Zeng, Chenchen
    Wu, Di
    Peng, Zhendong
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6182 - 6186
  • [7] Multi-Domain Recurrent Neural Network Language Model for Medical Speech Recognition
    Tilk, Ottokar
    Alumaee, Tanel
    HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE, BALTIC HLT 2014, 2014, 268 : 149 - +
  • [8] Multi-domain few-shot image recognition with knowledge transfer
    Li, Mingxi
    Wang, Ronggui
    Yang, Juan
    Xue, Lixia
    Hu, Min
    NEUROCOMPUTING, 2021, 442 : 64 - 72
  • [9] SOURCE DOMAIN DATA SELECTION FOR IMPROVED TRANSFER LEARNING TARGETING DYSARTHRIC SPEECH RECOGNITION
    Xiong, Feifei
    Barker, Jon
    Yue, Zhengjun
    Christensen, Heidi
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7424 - 7428
  • [10] Zero-Shot Transfer Learning with Synthesized Data for Multi-Domain Dialogue State Tracking
    Campagna, Giovanni
    Foryciarz, Agata
    Moradshahi, Mehrad
    Lam, Monica S.
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 122 - 132