Denoised Bottleneck Features From Deep Autoencoders for Telephone Conversation Analysis

Cited by: 13
Authors
Janod, Killian [1 ]
Morchid, Mohamed [2 ]
Dufour, Richard [2 ]
Linares, Georges [2 ]
De Mori, Renato [3 ]
Affiliations
[1] Univ Avignon, Ctr Enseignement & Rech Informat, F-84911 Avignon, France
[2] Univ Avignon, Lab Informat Avignon, F-84911 Avignon, France
[3] McGill Univ, Comp Sci, Montreal, PQ H3A 2A7, Canada
Keywords
Automatic speech recognition (ASR); denoising autoencoders (DAEs); multilayer neural networks; speech analytics; stacked autoencoders (SAEs); ARCHITECTURES
DOI
10.1109/TASLP.2017.2718843
Chinese Library Classification
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
The automatic transcription of spoken documents is affected by recognition errors that are especially frequent when speech is acquired in severely noisy conditions. Automatic speech recognition errors induce errors in the linguistic features used for a variety of natural language processing tasks. Recently, denoising autoencoders (DAEs) and stacked autoencoders (SAEs) have been proposed, with interesting results, for acoustic feature denoising tasks. This paper deals with the recovery of corrupted linguistic features in spoken documents. Solutions based on DAEs and SAEs are considered and evaluated in a spoken conversation analysis task. In order to improve conversation theme classification accuracy, the possibility of combining abstractions obtained from manual and automatic transcription features is considered. As a result, two original representations of highly imperfect spoken documents are introduced. They are based on the bottleneck features of a supervised autoencoder that takes advantage of both noisy and clean transcriptions to improve the robustness of error-prone representations. Experimental results on a spoken conversation theme identification task show substantial accuracy improvements obtained with the proposed recovery of corrupted features.
Pages: 1505-1516
Page count: 12
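As a rough illustration of the bottleneck-feature idea summarized in the abstract, the sketch below trains a small denoising autoencoder that maps document-level features derived from automatic transcripts to their manual-transcription counterparts, and keeps the bottleneck activations as the denoised representation. The layer sizes, feature dimensions, and training data are invented for the example and do not reproduce the authors' configuration.

```python
# Minimal sketch (not the authors' exact architecture): a denoising autoencoder
# whose bottleneck activations serve as a compact document representation.
# Inputs are hypothetical document features from ASR output (noisy view);
# reconstruction targets are the corresponding features from manual
# transcriptions (clean view), mirroring the paper's noisy/clean pairing idea.
import torch
import torch.nn as nn

IN_DIM, BOTTLENECK_DIM = 400, 50   # hypothetical feature and bottleneck sizes

class BottleneckDAE(nn.Module):
    def __init__(self, in_dim=IN_DIM, bn_dim=BOTTLENECK_DIM):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 200), nn.ReLU(),
            nn.Linear(200, bn_dim), nn.ReLU(),   # bottleneck layer
        )
        self.decoder = nn.Sequential(
            nn.Linear(bn_dim, 200), nn.ReLU(),
            nn.Linear(200, in_dim),
        )

    def forward(self, x):
        code = self.encoder(x)           # bottleneck features
        return self.decoder(code), code

# Toy training loop on random stand-in data; real data would be paired
# ASR / manual-transcription feature vectors for the same conversations.
noisy = torch.randn(256, IN_DIM)                  # features from automatic transcripts
clean = noisy + 0.1 * torch.randn(256, IN_DIM)    # stand-in for manual transcripts

model = BottleneckDAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(20):
    recon, _ = model(noisy)
    loss = loss_fn(recon, clean)   # reconstruct the clean view from the noisy one
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, the bottleneck activations act as the "denoised" document
# representation fed to a downstream theme classifier.
_, bottleneck_features = model(noisy)
print(bottleneck_features.shape)   # torch.Size([256, 50])
```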