Deep learning: Computational aspects

Cited: 10
Authors
Polson, Nicholas [1 ]
Sokolov, Vadim [2 ]
Affiliations
[1] Univ Chicago, Booth Sch Business, Chicago, IL 60637 USA
[2] George Mason Univ, Syst Engn & Operat Res, Fairfax, VA 22030 USA
Keywords
deep learning; linear algebra; stochastic gradient descent; representation; variables; search
DOI
10.1002/wics.1500
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline Classification Codes
020208; 070103; 0714;
Abstract
In this article, we review computational aspects of deep learning (DL). DL uses network architectures consisting of hierarchical layers of latent variables to construct predictors for high-dimensional input-output models. Training a DL architecture is computationally intensive, and efficient linear algebra libraries are key for both training and inference. Stochastic gradient descent (SGD) optimization and batch sampling are used to learn from massive datasets.
This article is categorized under:
Statistical Learning and Exploratory Methods of the Data Sciences > Deep Learning
Statistical Learning and Exploratory Methods of the Data Sciences > Modeling Methods
Statistical Learning and Exploratory Methods of the Data Sciences > Neural Networks
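The abstract's core technique, SGD with minibatch ("batch") sampling, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the model (linear regression), data sizes, learning rate, and batch size are all illustrative assumptions.

```python
import numpy as np

# Minimal sketch of minibatch SGD on a least-squares problem.
# All names and hyperparameters here are illustrative, not from the paper.
rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.01 * rng.normal(size=n)   # noisy linear targets

w = np.zeros(d)            # parameters to learn
lr, batch_size = 0.1, 32   # step size and minibatch size

for step in range(500):
    idx = rng.choice(n, size=batch_size, replace=False)  # batch sampling
    Xb, yb = X[idx], y[idx]
    grad = 2 * Xb.T @ (Xb @ w - yb) / batch_size         # minibatch MSE gradient
    w -= lr * grad                                       # SGD update
```

Each step estimates the full-data gradient from a random minibatch, which is what makes SGD scale to the massive datasets the abstract mentions.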
Pages: 17