Learning Deep Architectures for AI

Cited by: 6072
Authors
Bengio, Yoshua [1]
Affiliations
[1] Univ Montreal, Dept IRO, CP 6128, Montreal H3C 3J7, PQ, Canada
Source
FOUNDATIONS AND TRENDS IN MACHINE LEARNING | 2009, Vol. 2, No. 1
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
DOI
10.1561/2200000006
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Theoretical results suggest that in order to learn the kind of complicated functions that can represent high-level abstractions (e.g., in vision, language, and other AI-level tasks), one may need deep architectures. Deep architectures are composed of multiple levels of non-linear operations, such as in neural nets with many hidden layers or in complicated propositional formulae re-using many sub-formulae. Searching the parameter space of deep architectures is a difficult task, but learning algorithms such as those for Deep Belief Networks have recently been proposed to tackle this problem with notable success, beating the state-of-the-art in certain areas. This monograph discusses the motivations and principles regarding learning algorithms for deep architectures, in particular those exploiting as building blocks unsupervised learning of single-layer models such as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks.
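To make the building-block idea in the abstract concrete, here is a minimal sketch of the approach it describes: a Restricted Boltzmann Machine trained with one-step contrastive divergence (CD-1), stacked greedily so that each layer models the hidden activations of the layer below. This is an illustrative sketch, not code from the monograph; the class and function names, learning rate, epoch count, and training loop are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Restricted Boltzmann Machine with binary units, trained by
    one-step contrastive divergence (CD-1). Illustrative sketch only."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        # P(h = 1 | v) for each hidden unit.
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        # P(v = 1 | h) for each visible unit.
        return sigmoid(h @ self.W.T + self.b)

    def cd1_update(self, v0, lr=0.1):
        # Positive phase: hidden probabilities given the data.
        ph0 = self.hidden_probs(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one Gibbs step back down and up again.
        pv1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(pv1)
        # Approximate log-likelihood gradient (CD-1 update).
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)

def train_dbn(data, layer_sizes, epochs=5):
    """Greedy layer-wise pre-training: each RBM learns to model the
    hidden activations of the RBM below it, yielding a DBN stack."""
    rbms, inputs = [], data  # data: array of values in [0, 1]
    for n_hidden in layer_sizes:
        rbm = RBM(inputs.shape[1], n_hidden)
        for _ in range(epochs):
            rbm.cd1_update(inputs)
        rbms.append(rbm)
        inputs = rbm.hidden_probs(inputs)  # feed activations upward
    return rbms

# Toy usage with hypothetical binary data:
# X = (np.random.default_rng(1).random((500, 784)) < 0.5).astype(float)
# dbn = train_dbn(X, layer_sizes=[256, 64])
```

In the monograph's account, the greedily pre-trained stack would then typically be fine-tuned with a supervised criterion; that step is omitted from this sketch.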
Pages
1-127 (127 pages)
References
205 entries in total
[1] Ackley, D. H. (1985). Cognitive Science, 9, 147.
[2] Ahmed, A., Yu, K., Xu, W., Gong, Y., & Xing, E. (2008). Training hierarchical feed-forward visual recognition models using transfer learning from pseudo-tasks. Computer Vision - ECCV 2008, Part III, Proceedings, 5304, 69+.
[3] Allgower, E. L. (1980). Springer Series in Computational Mathematics.
[4] Andrieu, C., de Freitas, N., Doucet, A., & Jordan, M. I. (2003). An introduction to MCMC for machine learning. Machine Learning, 50(1-2), 5-43.
[5] Attwell, D., & Laughlin, S. B. (2001). An energy budget for signaling in the grey matter of the brain. Journal of Cerebral Blood Flow and Metabolism, 21(10), 1133-1145.
[6] Bagnell, J. A. (2009). Advances in Neural Information Processing Systems.
[7] Baxter, J. (1995). Proceedings of the Eighth Annual Conference on Computational Learning Theory, 311. DOI: 10.1145/225298.225336
[8] Baxter, J. (1997). A Bayesian information theoretic model of learning to learn via multiple task sampling. Machine Learning, 28(1), 7-39.
[9] Belkin, M., Matveeva, I., & Niyogi, P. (2004). Regularization and semi-supervised learning on large graphs. Learning Theory, Proceedings, 3120, 624-638.
[10] Belkin, M. (2003). Advances in Neural Information Processing Systems, 15.