A Taxonomy of Deep Convolutional Neural Nets for Computer Vision

Cited by: 121
Authors
Srinivas, Suraj [1]
Sarvadevabhatla, Ravi Kiran [1]
Mopuri, Konda Reddy [1]
Prabhu, Nikita [1]
Kruthiventi, Srinivas S. S. [1]
Babu, R. Venkatesh [1]
Affiliation
[1] Indian Inst Sci, Video Analyt Lab, Dept Computat & Data Sci, Bangalore, Karnataka, India
Keywords
deep learning; convolutional neural networks; object classification; recurrent neural networks; supervised learning;
DOI
10.3389/frobt.2015.00036
Chinese Library Classification
TP24 [Robotics];
Discipline classification code
080202 ; 1405 ;
Abstract
Traditional architectures for solving computer vision problems and the degree of success they enjoyed have been heavily reliant on hand-crafted features. However, of late, deep learning techniques have offered a compelling alternative: that of automatically learning problem-specific features. With this new paradigm, every problem in computer vision is now being re-examined from a deep learning perspective. Therefore, it has become important to understand what kind of deep networks are suitable for a given problem. Although general surveys of this fast-moving paradigm (i.e., deep networks) exist, a survey specific to computer vision is missing. We specifically consider one form of deep networks widely used in computer vision: convolutional neural networks (CNNs). We start with "AlexNet" as our base CNN and then examine the broad variations proposed over time to suit different applications. We hope that our recipe-style survey will serve as a guide, particularly for novice practitioners intending to use deep-learning techniques for computer vision.
Pages: 13
Related papers
112 in total
[1]  
[Anonymous], 2014, ABS14065726 CORR
[2]   VQA: Visual Question Answering [J].
Antol, Stanislaw ;
Agrawal, Aishwarya ;
Lu, Jiasen ;
Mitchell, Margaret ;
Batra, Dhruv ;
Zitnick, C. Lawrence ;
Parikh, Devi .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :2425-2433
[3]  
Ba J., 2014, PROC INT C NEURAL IN, P2654
[4]   Neural Codes for Image Retrieval [J].
Babenko, Artem ;
Slesarev, Anton ;
Chigorin, Alexandr ;
Lempitsky, Victor .
COMPUTER VISION - ECCV 2014, PT I, 2014, 8689 :584-599
[5]  
Bahdanau Dzmitry, 2016, arXiv, DOI 10.48550/arXiv.1409.0473
[6]   Lucas-Kanade 20 years on: A unifying framework [J].
Baker, S ;
Matthews, I .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2004, 56 (03) :221-255
[7]  
Bengio Y., 2013, ARXIV13061091
[8]   Learning Deep Architectures for AI [J].
Bengio, Yoshua .
FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2009, 2 (01) :1-127
[9]  
Bishop C., 2006, PATTERN RECOGNITION
[10]  
Bromley J., 1993, International Journal of Pattern Recognition and Artificial Intelligence, V7, P669, DOI 10.1142/S0218001493000339