To go deep or wide in learning?

被引:0
作者
Pandey, Gaurav [1 ]
Dukkipati, Ambedkar [1 ]
机构
[1] Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, Karnataka, India
来源
ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 33 | 2014年 / 33卷
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
To achieve acceptable performance for AI tasks, one can either use sophisticated feature extraction methods as the first layer in a two-layered supervised learning model, or learn the features directly using a deep (multilayered) model. While the first approach is very problem-specific, the second approach has computational overheads in learning multiple layers and fine-tuning of the model. In this paper, we propose an approach called wide learning based on arc-cosine kernels, that learns a single layer of infinite width. We propose exact and inexact learning strategies for wide learning and show that wide learning with single layer outperforms single layer as well as deep architectures of finite width for some benchmark datasets.
引用
收藏
页码:724 / 732
页数:9
相关论文
共 23 条
[11]  
Ciresan D, 2012, PROC CVPR IEEE, P3642, DOI 10.1109/CVPR.2012.6248110
[12]   Training invariant support vector machines [J].
Decoste, D ;
Schölkopf, B .
MACHINE LEARNING, 2002, 46 (1-3) :161-190
[13]  
Fischer Asja, 2012, Progress in Pattern Recognition, Image Analysis, ComputerVision, and Applications. Proceedings 17th Iberoamerican Congress, CIARP 2012, P14, DOI 10.1007/978-3-642-33275-3_2
[14]   Training products of experts by minimizing contrastive divergence [J].
Hinton, GE .
NEURAL COMPUTATION, 2002, 14 (08) :1771-1800
[15]   A fast learning algorithm for deep belief nets [J].
Hinton, Geoffrey E. ;
Osindero, Simon ;
Teh, Yee-Whye .
NEURAL COMPUTATION, 2006, 18 (07) :1527-1554
[16]  
Huang GB, 2004, IEEE IJCNN, P985
[17]  
Lamblin P, 2010, NIPS 2010 DEEP LEARN
[18]   Gradient-based learning applied to document recognition [J].
Lecun, Y ;
Bottou, L ;
Bengio, Y ;
Haffner, P .
PROCEEDINGS OF THE IEEE, 1998, 86 (11) :2278-2324
[19]   Distinctive image features from scale-invariant keypoints [J].
Lowe, DG .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2004, 60 (02) :91-110
[20]  
Nair V., 2010, P 27 INT C MACH LEAR, P807