TRAINING LOGICAL NEURAL NETWORKS BY PRIMAL-DUAL METHODS FOR NEURO-SYMBOLIC REASONING

Cited by: 3

Authors:

Lu, Songtao ^{[1]}

Khan, Naweed ^{[2]}

Akhalwaya, Ismail Yunus ^{[2,3]}

Riegel, Ryan ^{[1]}

Horesh, Lior ^{[1]}

Gray, Alexander ^{[1]}

Affiliations:

[1] IBM Thomas J Watson Res Ctr, IBM Res AI, Yorktown Hts, NY 10598 USA

[2] IBM Res Africa, Johannesburg, South Africa

[3] Univ Witwatersrand, Sch Comp Sci & Appl Math, Johannesburg, South Africa

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021

Keywords:

ALTERNATING DIRECTION METHOD; CONVERGENCE ANALYSIS; NONCONVEX; COMPLEXITY;

DOI:

10.1109/ICASSP39728.2021.9415044

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Parametrized machine learning models for inference often include non-linear and nonconvex constraints over the parameters and meta-parameters. Training these models to convergence is in general difficult, and naive methods such as projected gradient descent or grid search are not easily able to enforce the functional constraints. This work explores the optimization of a constrained neural network (familiar from machine learning but with parameter constraints), in the service of neuro-symbolic logical reasoning. Logical neural networks (LNNs) provide a well-justified, interpretable example of training under non-trivial constraints. In this paper, we propose a unified framework for solving this nonlinear programming problem by leveraging primal-dual optimization methods, and quantify the corresponding convergence rate to the Karush-Kuhn-Tucker (KKT) points of this problem. Extensive numerical results on both a toy example and training an LNN over real datasets validate the efficacy of the method.


Pages: 5559-5563

Page count: 5
