Learning Deep Architectures for AI

Cited by: 6072
Authors
Bengio, Yoshua [1]
Affiliations
[1] Univ Montreal, Dept IRO, CP 6128, Montreal H3C 3J7, PQ, Canada
Source
FOUNDATIONS AND TRENDS IN MACHINE LEARNING | 2009, Vol. 2, No. 1
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
DOI
10.1561/2200000006
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Theoretical results suggest that in order to learn the kind of complicated functions that can represent high-level abstractions (e.g., in vision, language, and other AI-level tasks), one may need deep architectures. Deep architectures are composed of multiple levels of non-linear operations, such as in neural nets with many hidden layers or in complicated propositional formulae re-using many sub-formulae. Searching the parameter space of deep architectures is a difficult task, but learning algorithms such as those for Deep Belief Networks have recently been proposed to tackle this problem with notable success, beating the state-of-the-art in certain areas. This monograph discusses the motivations and principles regarding learning algorithms for deep architectures, in particular those exploiting as building blocks unsupervised learning of single-layer models such as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks.
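To make the building-block idea in the abstract concrete, here is a minimal sketch of the approach it describes: a Restricted Boltzmann Machine trained with one-step contrastive divergence (CD-1), stacked greedily so that each layer models the hidden activations of the layer below. This is an illustrative sketch, not code from the monograph; the class and function names, learning rate, epoch count, and training loop are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Restricted Boltzmann Machine with binary units, trained by
    one-step contrastive divergence (CD-1). Illustrative sketch only."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        # P(h = 1 | v) for each hidden unit.
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        # P(v = 1 | h) for each visible unit.
        return sigmoid(h @ self.W.T + self.b)

    def cd1_update(self, v0, lr=0.1):
        # Positive phase: hidden probabilities given the data.
        ph0 = self.hidden_probs(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one Gibbs step back down and up again.
        pv1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(pv1)
        # Approximate log-likelihood gradient (CD-1 update).
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)

def train_dbn(data, layer_sizes, epochs=5):
    """Greedy layer-wise pre-training: each RBM learns to model the
    hidden activations of the RBM below it, yielding a DBN stack."""
    rbms, inputs = [], data  # data: array of values in [0, 1]
    for n_hidden in layer_sizes:
        rbm = RBM(inputs.shape[1], n_hidden)
        for _ in range(epochs):
            rbm.cd1_update(inputs)
        rbms.append(rbm)
        inputs = rbm.hidden_probs(inputs)  # feed activations upward
    return rbms

# Toy usage with hypothetical binary data:
# X = (np.random.default_rng(1).random((500, 784)) < 0.5).astype(float)
# dbn = train_dbn(X, layer_sizes=[256, 64])
```

In the monograph's account, the greedily pre-trained stack would then typically be fine-tuned with a supervised criterion; that step is omitted from this sketch.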
Pages
1-127 (127 pages)
References
205 entries in total
[1] Ackley, D. H. (1985). Cognitive Science, 9, 147.
[2] Ahmed, A., Yu, K., Xu, W., Gong, Y., & Xing, E. (2008). Training hierarchical feed-forward visual recognition models using transfer learning from pseudo-tasks. Computer Vision - ECCV 2008, Part III, Proceedings, 5304, 69+.
[3] Allgower, E. L. (1980). Springer Series in Computational Mathematics.
[4] Andrieu, C., de Freitas, N., Doucet, A., & Jordan, M. I. (2003). An introduction to MCMC for machine learning. Machine Learning, 50(1-2), 5-43.
[5] Attwell, D., & Laughlin, S. B. (2001). An energy budget for signaling in the grey matter of the brain. Journal of Cerebral Blood Flow and Metabolism, 21(10), 1133-1145.
[6] Bagnell, J. A. (2009). Advances in Neural Information Processing Systems.
[7] Baxter, J. (1995). Proceedings of the Eighth Annual Conference on Computational Learning Theory, 311. DOI: 10.1145/225298.225336
[8] Baxter, J. (1997). A Bayesian information theoretic model of learning to learn via multiple task sampling. Machine Learning, 28(1), 7-39.
[9] Belkin, M., Matveeva, I., & Niyogi, P. (2004). Regularization and semi-supervised learning on large graphs. Learning Theory, Proceedings, 3120, 624-638.
[10] Belkin, M. (2003). Advances in Neural Information Processing Systems, 15.