Interplay between depth and width for interpolation in neural ODEs

Cited by: 1
Authors
Alvarez-Lopez, Antonio [1, 3]
Slimane, Arselane Hadj [2]
Zuazua, Enrique [1, 3, 4]
Affiliations
[1] Univ Autonoma Madrid, Dept Ingn Quim, C Francisco Tomas & Valiente 7, 28049 Madrid, Spain
[2] ENS Paris Saclay, 4 Ave Sci, F-91190 Gif Sur Yvette, France
[3] Friedrich Alexander Univ Erlangen Nurnberg, Chair Dynam Control Machine Learning & Numer Alexa, Dept Math, Cauerstr 11, D-91058 Erlangen, Germany
[4] Fdn Deusto, Ave Univ 24, Bilbao 48007, Spain
Keywords
Neural ODEs; Depth; Width; Simultaneous controllability; Transport control; Wasserstein distance
DOI
10.1016/j.neunet.2024.106640
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Neural ordinary differential equations have emerged as a natural tool for supervised learning from a control perspective, yet a complete understanding of the role played by their architecture remains elusive. In this work, we examine the interplay between the width $p$ and the number of transitions between layers $L$ (corresponding to a depth of $L+1$). Specifically, we construct explicit controls interpolating either a finite dataset $\mathcal{D}$, comprising $N$ pairs of points in $\mathbb{R}^d$, or two probability measures within a Wasserstein error margin $\varepsilon > 0$. Our findings reveal a balancing trade-off between $p$ and $L$, with $L$ scaling as $1 + O(N/p)$ for data interpolation, and as $1 + O(p^{-1} + (1+p)^{-1}\varepsilon^{-d})$ for measures. In the high-dimensional and wide setting where $d, p > N$, our result can be refined to achieve $L = 0$. This naturally raises the problem of data interpolation in the autonomous regime, characterized by $L = 0$. We adopt two alternative approaches: either controlling in a probabilistic sense, or relaxing the target condition. In the first case, when $p = N$, we develop an inductive control strategy based on a separability assumption whose probability increases with $d$. In the second one, we establish an explicit error decay rate with respect to $p$, which results from applying a universal approximation theorem to a custom-built Lipschitz vector field interpolating $\mathcal{D}$.
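To make the roles of the width p and the number of transitions L concrete, the following is a minimal sketch, not the authors' construction. The abstract does not state the model, so the sketch assumes the form x'(t) = W(t) σ(A(t) x(t) + b(t)) with σ = ReLU, W of size d×p, A of size p×d, b of length p, and controls (W, A, b) that are piecewise constant in time. A list of K parameter triples then corresponds to L = K − 1 transitions, and a single triple to the autonomous regime L = 0. The function names, the forward Euler integrator, and the step count are illustrative choices.

```python
def relu(z):
    """Componentwise ReLU activation (assumed choice of sigma)."""
    return [max(v, 0.0) for v in z]


def matvec(M, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def vector_field(x, W, A, b):
    """Assumed neural ODE right-hand side: W * relu(A x + b).

    A has p rows (the width), W has d rows and p columns.
    """
    hidden = relu([a + bi for a, bi in zip(matvec(A, x), b)])
    return matvec(W, hidden)


def flow(x0, controls, T=1.0, steps_per_piece=200):
    """Integrate the ODE on [0, T] with forward Euler.

    controls is a list of (W, A, b) triples, each held constant on an
    equal share of [0, T]; len(controls) - 1 plays the role of L.
    """
    x = [float(v) for v in x0]
    dt = T / (len(controls) * steps_per_piece)
    for W, A, b in controls:
        for _ in range(steps_per_piece):
            v = vector_field(x, W, A, b)
            x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Width p = 1 in dimension d = 2: one hidden neuron that is always active.
A = [[0.0, 0.0]]
b = [1.0]
W_right = [[1.0], [0.0]]  # pushes points along the first axis
W_up = [[0.0], [1.0]]     # pushes points along the second axis

autonomous = flow([0.0, 0.0], [(W_right, A, b)])             # L = 0
switched = flow([0.0, 0.0], [(W_right, A, b), (W_up, A, b)])  # L = 1
```

In this toy setting the single neuron can only move points in one direction per time piece, so reaching a target off the first axis requires one switch; a wider field could combine both directions in a single autonomous piece, which is the kind of trade-off between p and L that the abstract quantifies.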
Pages: 14