How does topology influence gradient propagation and model performance of deep networks with DenseNet-type skip connections?

Cited by: 10
Authors
Bhardwaj, Kartikeya [1]
Li, Guihong [2]
Marculescu, Radu [2]
Affiliations
[1] Arm Inc, San Jose, CA 95134 USA
[2] Univ Texas Austin, Austin, TX 78712 USA
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
Funding
National Science Foundation (USA);
Keywords
DOI
10.1109/CVPR46437.2021.01329
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
DenseNets introduce concatenation-type skip connections that achieve state-of-the-art accuracy in several computer vision tasks. In this paper, we reveal that the topology of the concatenation-type skip connections is closely related to the gradient propagation which, in turn, enables a predictable behavior of DNNs' test performance. To this end, we introduce a new metric called NN-Mass to quantify how effectively information flows through DNNs. Moreover, we empirically show that NN-Mass also works for other types of skip connections, e.g., for ResNets, Wide-ResNets (WRNs), and MobileNets, which contain addition-type skip connections (i.e., residuals or inverted residuals). As such, for both DenseNet-like CNNs and ResNets/WRNs/MobileNets, our theoretically grounded NN-Mass can identify models with similar accuracy, despite having significantly different size/compute requirements. Detailed experiments on both synthetic and real datasets (e.g., MNIST, CIFAR-10, CIFAR-100, ImageNet) provide extensive evidence for our insights. Finally, the closed-form equation of our NN-Mass enables us to design significantly compressed DenseNets (for CIFAR-10) and MobileNets (for ImageNet) directly at initialization without time-consuming training and/or searching.
Pages: 13493-13502
Page count: 10