MULTI-STYLE MLP FEATURES FOR BN TRANSCRIPTION

Cited by: 6
Authors
Le, Viet-Bac [1]
Lamel, Lori [1]
Gauvain, Jean-Luc [1]
Affiliation
[1] LIMSI CNRS, Spoken Language Proc Grp, F-91403 Orsay, France
Source
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010
Keywords
MLP features; condition-specific adaptation; BN transcription;
DOI
10.1109/ICASSP.2010.5495116
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
It has become common practice to adapt acoustic models to specific conditions (gender, accent, bandwidth) in order to improve the performance of speech-to-text (STT) transcription systems. With the growing interest in the use of discriminative features produced by a multilayer perceptron (MLP) in such systems, the question arises of whether it is necessary to specialize the MLP to particular conditions and, if so, how to incorporate the condition-specific MLP features in the system. This paper explores three approaches (adaptation, full training, and feature merging) to using condition-specific MLP features in a state-of-the-art BN STT system for French. The third approach, feature merging without condition-specific adaptation, was found to outperform the original models with condition-specific adaptation and to perform almost as well as full training of multiple condition-specific HMMs.
Pages: 4866 - 4869
Number of pages: 4
Related papers
50 in total
  • [41] Authoring multi-style terrain with global-to-local control
    Zhang, Jian
    Li, Chen
    Zhou, Peichi
    Wang, Changbo
    He, Gaoqi
    Qin, Hong
    GRAPHICAL MODELS, 2022, 119
  • [43] Pseudo-Supervised Learning for Semantic Multi-Style Transfer
    Kim, Saehun
    Do, Jeonghyeok
    Kim, Munchurl
    IEEE ACCESS, 2021, 9 (09) : 7930 - 7942
  • [44] Towards an Unsupervised Speaking Style Voice Building Framework: multi-style speaker diarization
    Lorenzo-Trueba, J.
    Martinez-Gonzalez, B.
    Lopez-Ludena, V.
    Barra-Chicote, R.
    Ferreiros, J.
    Yamagishi, J.
    Montero, J. M.
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012 : 2275 - 2278
  • [45] A CAD system for multi-style thermal functional design of clothing
    Mao Aihua
    Li Yi
    Luo Xiaonan
    Wang Ruomei
    Wang Shuxiao
    COMPUTER-AIDED DESIGN, 2008, 40 (09) : 916 - 930
  • [46] Pop Music Generation: From Melody to Multi-style Arrangement
    Zhu, Hongyuan
    Liu, Qi
    Yuan, Nicholas Jing
    Zhang, Kun
    Zhou, Guang
    Chen, Enhong
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2020, 14 (05)
  • [47] Multi-style Training for South African Call Centre Audio
    Heymans, Walter
    Davel, Marelie H.
    van Heerden, Charl
    ARTIFICIAL INTELLIGENCE RESEARCH, SACAIR 2021, 2022, 1551 : 111 - 124
  • [48] UNSUPERVISED LEARNING FOR MULTI-STYLE SPEECH SYNTHESIS WITH LIMITED DATA
    Liang, Shuang
    Miao, Chenfeng
    Chen, Minchuan
    Ma, Jun
    Wang, Shaojun
    Xiao, Jing
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021 : 6583 - 6587
  • [49] MSCap: Multi-Style Image Captioning with Unpaired Stylized Text
    Guo, Longteng
    Liu, Jing
    Yao, Peng
    Li, Jiangwei
    Lu, Hanqing
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019 : 4199 - 4208
  • [50] Is trading behavior stable across contexts? Evidence from style and multi-style investors
    Blackburn, Douglas W.
    Goetzmann, William N.
    Ukhov, Andrey D.
    QUANTITATIVE FINANCE, 2014, 14 (04) : 605 - 627