UNSUPERVISED LEARNING OF ACOUSTIC FEATURES VIA DEEP CANONICAL CORRELATION ANALYSIS

Authors
Wang, Weiran [1]
Arora, Raman [2]
Livescu, Karen [1]
Bilmes, Jeff A. [3]
Affiliations
[1] TTI Chicago, Chicago, IL 60637 USA
[2] Johns Hopkins Univ, Baltimore, MD USA
[3] Univ Washington, Seattle, WA 98195 USA
Source
2015 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2015
Keywords
multi-view learning; neural networks; deep canonical correlation analysis; XRMB; articulatory measurements
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
It has previously been shown that, when both acoustic and articulatory training data are available, phonetic recognition accuracy can be improved by learning acoustic features from this multi-view data with canonical correlation analysis (CCA). In contrast with previous work based on linear or kernel CCA, we use the recently proposed deep CCA, where the functional form of the feature mapping is a deep neural network. We apply the approach to a speaker-independent phonetic recognition task using data from the University of Wisconsin X-ray Microbeam Database. Using a tandem-style recognizer on this task, deep CCA features improve over earlier multi-view approaches as well as over articulatory inversion and typical neural network-based tandem features. We also present a new stochastic training approach for deep CCA, which produces both faster training and better-performing features.
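For orientation, the objective commonly associated with deep CCA can be sketched as below. This is the standard textbook-style formulation and is not quoted from this record. All symbols are notation assumed here: f and g are the two view networks (acoustic and articulatory) with parameters \theta_f and \theta_g, F and G are the matrices of their centered outputs on N paired training samples, U and V are linear projections with columns u_i and v_j, and r_x and r_y are small regularization constants.

```latex
% Assumed notation: F = [f(x_1), ..., f(x_N)], G = [g(y_1), ..., g(y_N)], both centered.
\begin{aligned}
\max_{\theta_f,\,\theta_g,\,U,\,V}\quad
  & \frac{1}{N}\,\operatorname{tr}\!\left(U^\top F G^\top V\right) \\
\text{s.t.}\quad
  & U^\top\!\left(\tfrac{1}{N} F F^\top + r_x I\right) U = I, \\
  & V^\top\!\left(\tfrac{1}{N} G G^\top + r_y I\right) V = I, \\
  & u_i^\top F G^\top v_j = 0 \quad \text{for } i \neq j .
\end{aligned}
```

Linear CCA corresponds to the special case in which f and g are identity maps, so only U and V are learned. Kernel CCA replaces them with fixed kernel-induced feature maps, and deep CCA learns them as deep neural networks.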
Pages: 4590-4594
Page count: 5