UNDERSTANDING HOW DEEP BELIEF NETWORKS PERFORM ACOUSTIC MODELLING

Cited by: 0

Authors:

Mohamed, Abdel-rahman [1]

Hinton, Geoffrey [1]

Penn, Gerald [1]

Affiliations:

[1] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 1A1, Canada

Source:

2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2012

Keywords:

Deep belief networks; neural networks; acoustic modeling

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Deep Belief Networks (DBNs) are a very competitive alternative to Gaussian mixture models for relating states of a hidden Markov model to frames of coefficients derived from the acoustic input. They are competitive for three reasons: DBNs can be fine-tuned as neural networks; DBNs have many non-linear hidden layers; and DBNs are generatively pre-trained. This paper illustrates how each of these three aspects contributes to the DBN's good recognition performance using both phone recognition performance on the TIMIT corpus and a dimensionally reduced visualization of the relationships between the feature vectors learned by the DBNs that preserves the similarity structure of the feature vectors at multiple scales. The same two methods are also used to


Pages: 4273-4276

Page count: 4

