MULTI-STYLE MLP FEATURES FOR BN TRANSCRIPTION

Cited by: 6

Authors:

Le, Viet-Bac [1]

Lamel, Lori [1]

Gauvain, Jean-Luc [1]

Affiliations:

[1] LIMSI CNRS, Spoken Language Proc Grp, F-91403 Orsay, France

Source:

2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010

Keywords:

MLP features; condition-specific adaptation; BN transcription

DOI:

10.1109/ICASSP.2010.5495116

Chinese Library Classification number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

It has become common practice to adapt acoustic models to specific conditions (gender, accent, bandwidth) in order to improve the performance of speech-to-text (STT) transcription systems. With the growing interest in the use of discriminative features produced by a multilayer perceptron (MLP) in such systems, the question arises of whether it is necessary to specialize the MLP to particular conditions and, if so, how to incorporate the condition-specific MLP features in the system. This paper explores three approaches (adaptation, full training, and feature merging) to using condition-specific MLP features in a state-of-the-art broadcast news (BN) STT system for French. The third approach, without condition-specific adaptation, was found to outperform the original models with condition-specific adaptation and to perform almost as well as full training of multiple condition-specific HMMs.


Pages: 4866-4869

Page count: 4
