On the Omnipresence of Spurious Local Minima in Certain Neural Network Training Problems

Cited: 2
Authors
Christof, Constantin [1 ]
Kowalczyk, Julia [1 ]
Affiliations
[1] Tech Univ Munich, Chair Optimal Control, Ctr Math Sci, Boltzmannstr 3, D-85748 Garching, Germany
Keywords
Deep artificial neural network; Spurious local minimum; Training problem; Loss landscape; Hadamard well-posedness; Best approximation; Stability analysis; Local affine linearity; Approximation; Landscape
DOI
10.1007/s00365-023-09658-w
Chinese Library Classification
O1 [Mathematics]
Discipline codes
0701; 070101
Abstract
We study the loss landscape of training problems for deep artificial neural networks with a one-dimensional real output whose activation functions contain an affine segment and whose hidden layers have width at least two. It is shown that such problems possess a continuum of spurious (i.e., not globally optimal) local minima for all target functions that are not affine. In contrast to previous works, our analysis covers all sampling and parameterization regimes, general differentiable loss functions, arbitrary continuous nonpolynomial activation functions, and both the finite- and the infinite-dimensional setting. It is further shown that the appearance of the spurious local minima in the considered training problems is a direct consequence of the universal approximation theorem and that the underlying mechanisms also cause, e.g., L^p-best approximation problems to be ill-posed in the sense of Hadamard for all networks that do not have a dense image. The latter result also holds without the assumption of local affine linearity and without any conditions on the hidden layers.
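For orientation, the following is a minimal sketch of the kind of training problem the abstract refers to, in notation chosen here for illustration (the symbols psi_theta, ell, f, Omega, and mu are assumptions and are not taken from the paper): a network realization \psi_\theta \colon \mathbb{R}^d \to \mathbb{R} with parameters \theta \in \Theta, hidden layers of width at least two, and an activation function that is affine on some interval is fitted to a target function f, either from finitely many samples or in an integral sense,

\[
  % illustrative formulation; the notation is an assumption, see the remark above
  \min_{\theta \in \Theta} \; \sum_{i=1}^{n} \ell\bigl(\psi_\theta(x_i), f(x_i)\bigr)
  \qquad \text{or} \qquad
  \min_{\theta \in \Theta} \; \int_{\Omega} \ell\bigl(\psi_\theta(x), f(x)\bigr)\, \mathrm{d}\mu(x),
\]

with a differentiable loss \ell. The paper's statement is that, for every target f that is not affine, problems of both types possess a continuum of local minima that are not globally optimal.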
Pages: 197-224
Page count: 28