OAMIP: Optimizing ANN Architectures Using Mixed-Integer Programming

Cited by: 4
Authors
ElAraby, Mostafa [1, 3]
Wolf, Guy [1, 4]
Carvalho, Margarida [2, 3]
Affiliations
[1] Mila Quebec AI Inst, Montreal, PQ, Canada
[2] CIRRELT, Montreal, PQ, Canada
[3] Univ Montreal, Dept Comp Sci & Operat Res, Montreal, PQ, Canada
[4] Univ Montreal, Dept Math & Stat, Montreal, PQ, Canada
Source
INTEGRATION OF CONSTRAINT PROGRAMMING, ARTIFICIAL INTELLIGENCE, AND OPERATIONS RESEARCH, CPAIOR 2023 | 2023 / Vol. 13884
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Pruning Neural Networks; Mixed Integer Programming; Neurons Ranking; Sparse Neural Networks; NETWORKS;
DOI
10.1007/978-3-031-33271-5_15
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we concentrate on the problem of finding a set of neurons in a trained neural network whose pruning leads to a marginal loss in accuracy. To this end, we introduce Optimizing ANN Architectures using Mixed-Integer Programming (OAMIP) to identify critical neurons and prune non-critical ones. The proposed OAMIP uses a Mixed-Integer Program (MIP) to assign importance scores to each neuron in deep neural network architectures. The impact of simultaneous neuron pruning on the main learning tasks guides the neurons' scores. By carefully devising the objective function of the MIP, we drive the solver to minimize the number of critical neurons (i.e., with high importance score) that maintain the overall accuracy of the trained neural network. Our formulation identifies optimized sub-network architectures that generalize across different datasets, a phenomenon known as lottery ticket optimization. This optimized architecture not only performs well on a single dataset but also generalizes across multiple ones upon retraining of network weights. Additionally, we present a scalable implementation of our pruning methodology by decoupling the importance scores across layers using auxiliary networks. Finally, we validate our approach experimentally, showing its ability to generalize on different datasets and architectures.
Pages: 219-237
Page count: 19