Convex-Concave Programming: An Effective Alternative for Optimizing Shallow Neural Networks

Cited by: 0
Authors
Askarizadeh, Mohammad [1]
Morsali, Alireza [2]
Tofigh, Sadegh [1]
Nguyen, Kim Khoa [1]
Affiliations
[1] Univ Quebec, Ecole Technol Super, Dept Elect & Comp Engn, Montreal, PQ H3C 1K3, Canada
[2] Aibrary, R&D, Burnaby, BC V5C 3C5, Canada
Source
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2025, Vol. 9, No. 4
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Neural network; difference convex function; convex-concave algorithm; convergence rate; complexity analysis; FEEDFORWARD NETWORKS; APPROXIMATION; OPTIMIZATION; ALGORITHMS;
DOI
10.1109/TETCI.2024.3502463
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
In this study, we address the challenges of non-convex optimization in neural networks (NNs) by formulating the training of multilayer perceptron (MLP) NNs as a difference-of-convex-functions (DC) problem. Using the basic convex-concave algorithm to solve our DC problems, we introduce two alternative optimization techniques, DC-GD and DC-OPT, for determining MLP parameters. By leveraging the non-uniqueness of the convex components in DC functions, we construct strongly convex components for the DC NN cost function. This strong convexity enables our proposed algorithms, DC-GD and DC-OPT, to achieve an iteration complexity of O(log(1/epsilon)), surpassing that of other solvers such as stochastic gradient descent (SGD), which has an iteration complexity of O(1/epsilon). This improvement raises the convergence rate from sublinear (SGD) to linear (ours) while maintaining comparable total computational cost. Furthermore, conventional NN optimizers such as SGD, RMSprop, and Adam are highly sensitive to the learning rate, adding computational overhead for practitioners who must select an appropriate value. In contrast, our DC-OPT algorithm is hyperparameter-free (i.e., it requires no learning rate), and our DC-GD algorithm is less sensitive to the learning rate while offering accuracy comparable to other solvers. Additionally, we extend our approach to a convolutional NN architecture, demonstrating its applicability to modern NNs. We evaluate the performance of the proposed algorithms by comparing them to conventional optimizers such as SGD, RMSprop, and Adam across various test cases. The results suggest that our approach is a viable alternative for optimizing shallow MLP NNs.
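To make the mechanics described in the abstract concrete, the sketch below applies the basic convex-concave procedure (CCP) to a toy difference-of-convex objective. It is not the paper's DC-GD or DC-OPT algorithm; the objective, the parameters lam, delta, rho, and the problem dimensions are assumptions chosen purely for illustration. It shows the two ideas the abstract relies on: adding the same quadratic (rho/2)||x||^2 to both DC components, which leaves the objective unchanged but makes each component strongly convex, and the CCP update that linearizes the subtracted convex part and minimizes the resulting convex surrogate, which guarantees monotone descent of the objective.

```python
import numpy as np

# Toy DC objective: f(x) = ||Ax - b||^2 - lam * sum_i sqrt(x_i^2 + delta)
# DC split: g(x) = ||Ax - b||^2 (convex), h(x) = lam * sum_i sqrt(x_i^2 + delta) (convex).
# Non-uniqueness trick: add (rho/2)||x||^2 to BOTH g and h, so f = g - h is unchanged
# but each augmented component is strongly convex.

rng = np.random.default_rng(0)
m, n = 8, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
lam, delta, rho = 5.0, 0.01, 1.0      # hypothetical values, for illustration only

def f(x):
    r = A @ x - b
    return r @ r - lam * np.sum(np.sqrt(x**2 + delta))

def grad_h(x):
    # Gradient of the subtracted convex part h(x) = lam * sum_i sqrt(x_i^2 + delta).
    return lam * x / np.sqrt(x**2 + delta)

x = rng.standard_normal(n)
H = 2.0 * A.T @ A + rho * np.eye(n)   # Hessian of the augmented convex part g(x) + (rho/2)||x||^2

for k in range(20):
    # CCP step: linearize h(x) + (rho/2)||x||^2 at the current iterate and minimize
    #   g(x) + (rho/2)||x||^2 - <grad_h(x_k) + rho * x_k, x>,
    # which for this quadratic g reduces to a single linear solve.
    c = grad_h(x) + rho * x
    x = np.linalg.solve(H, 2.0 * A.T @ b + c)
    print(f"iter {k:2d}   f(x) = {f(x):.6f}")   # f(x) is non-increasing across iterations
```

The closed-form subproblem solve is possible here only because the toy g is quadratic; in the paper's setting the convexified surrogate is built from the MLP cost and is handled by the proposed DC-GD and DC-OPT schemes instead.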
Pages: 2894-2907
Page count: 14