Convergence guarantees for forward gradient descent in the linear regression model

Times Cited: 0
Authors
Bos, Thijs [1 ]
Schmidt-Hieber, Johannes [2 ]
Affiliations
[1] Leiden Univ, Leiden, Netherlands
[2] Univ Twente, Twente, Netherlands
Keywords
Convergence rates; Estimation; Gradient descent; Linear model; Zeroth-order methods; Stochastic approximation; Optimal rates; Optimization
DOI
10.1016/j.jspi.2024.106174
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject Classification Codes
020208; 070103; 0714
Abstract
Renewed interest in the relationship between artificial and biological neural networks motivates the study of gradient-free methods. Considering the linear regression model with random design, in this work we theoretically analyze the biologically motivated (weight-perturbed) forward gradient scheme, which is based on a random linear combination of the gradient. If d denotes the number of parameters and k the number of samples, we prove that the mean squared error of this method converges for k ≳ d^2 log(d) with rate d^2 log(d)/k. Compared to the dimension dependence d for stochastic gradient descent, an additional factor d log(d) occurs.
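To illustrate the scheme the abstract describes, here is a minimal Python/NumPy sketch of weight-perturbed forward gradient descent for linear regression: the true per-sample gradient g is replaced by the forward gradient (g·v)v for a standard Gaussian direction v, which is an unbiased estimate of g since E[vv^T] = I_d. The function name, constant step size, and toy data below are illustrative assumptions, not the step-size schedule analyzed in the paper.

    import numpy as np

    def forward_gradient_descent(X, y, lr, seed=0):
        # Weight-perturbed forward gradient descent for least squares:
        # one update per sample, theta <- theta - lr * (g . v) v,
        # where g is the per-sample gradient and v ~ N(0, I_d).
        k, d = X.shape
        theta = np.zeros(d)
        rng = np.random.default_rng(seed)
        for i in range(k):
            g = (X[i] @ theta - y[i]) * X[i]  # gradient of 0.5 * (x'theta - y)^2
            v = rng.standard_normal(d)        # random perturbation direction
            theta -= lr * (g @ v) * v         # forward-gradient step; E[(g.v)v] = g
        return theta

    # Toy example with random design: k samples, d parameters.
    rng = np.random.default_rng(1)
    k, d = 5000, 10
    theta_star = rng.standard_normal(d)
    X = rng.standard_normal((k, d))
    y = X @ theta_star + 0.1 * rng.standard_normal(k)
    theta_hat = forward_gradient_descent(X, y, lr=0.005)
    print(np.mean((theta_hat - theta_star) ** 2))  # mean squared error

Because the forward gradient is a rank-one projection of the true gradient, its variance grows with the dimension d, which is the intuition behind the extra d log(d) factor relative to stochastic gradient descent.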
Pages: 9