Spurious Local Minima Are Common for Deep Neural Networks With Piecewise Linear Activations

Cited by: 1
Author
Liu, Bo [1]
Affiliation
[1] Beijing Univ Technol, Coll Comp Sci, Fac Informat Technol, Beijing 100124, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep learning; Neural networks; Biological neural networks; Neurons; Training; Minimization; Matrix decomposition; Convolutional neural networks (CNNs); deep learning theory; deep neural networks; local minima; loss landscape; MULTISTABILITY;
DOI
10.1109/TNNLS.2022.3204319
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
This article shows theoretically that spurious local minima are common for deep fully connected networks and average-pooling convolutional neural networks (CNNs) with piecewise linear activations, trained on datasets that cannot be fit by linear models. Motivating examples explain why spurious local minima exist: each output neuron of such a network produces a continuous piecewise linear (CPWL) function, and different pieces of the CPWL output can optimally fit disjoint groups of data samples when the empirical risk is minimized. Because fitting the data with different CPWL functions usually yields different levels of empirical risk, spurious local minima are prevalent. The results are proved in general settings, with arbitrary continuous loss functions and general piecewise linear activations. The main proof technique is to represent a CPWL function as a maximization over minimizations of its linear pieces; deep networks with piecewise linear activations are then constructed to produce these linear pieces and to implement the max-over-min operation.
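The key construction mentioned in the abstract is the max-over-min (lattice) representation of a CPWL function, which a network with piecewise linear activations can realize exactly. As an illustrative aid only, not code from the paper, the Python sketch below builds a toy three-piece CPWL function in max-over-min form and reproduces it using nothing but affine maps and ReLU, via the identities max(a, b) = b + relu(a - b) and min(a, b) = a - relu(a - b); the pieces p1, p2, p3 are chosen arbitrarily for illustration.

# Minimal numerical sketch (not from the paper): a CPWL function written as a
# max over mins of affine pieces, and the same max/min operations realized
# with ReLU units, in the spirit of the proof technique named in the abstract.
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Two-argument max and min expressed with a single ReLU each:
# max(a, b) = b + relu(a - b),  min(a, b) = a - relu(a - b)
def relu_max(a, b):
    return b + relu(a - b)

def relu_min(a, b):
    return a - relu(a - b)

# Three affine pieces of an example CPWL target f(x) = max(min(x + 1, 1), x - 1)
p1 = lambda x: x + 1.0
p2 = lambda x: np.full_like(x, 1.0)
p3 = lambda x: x - 1.0

def cpwl_maxmin(x):
    # Max-over-min form: f(x) = max(min(p1(x), p2(x)), p3(x))
    return np.maximum(np.minimum(p1(x), p2(x)), p3(x))

def cpwl_relu_net(x):
    # Same function computed only with affine maps and ReLU,
    # i.e., operations a piecewise-linear network can implement.
    return relu_max(relu_min(p1(x), p2(x)), p3(x))

x = np.linspace(-3.0, 3.0, 1001)
assert np.allclose(cpwl_maxmin(x), cpwl_relu_net(x))
print("max-min CPWL form matches the ReLU construction on", x.size, "test points")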
Pages: 5382-5394 (13 pages)