Sparse Neural Additive Model: Interpretable Deep Learning with Feature Selection via Group Sparsity

Cited by: 2
Authors
Xu, Shiyun [1 ]
Bu, Zhiqi [1 ]
Chaudhari, Pratik [2 ]
Barnett, Ian J. [3 ]
Affiliations
[1] Univ Penn, Dept Appl Math & Computat Sci, Philadelphia, PA 19104 USA
[2] Univ Penn, Dept Elect & Syst Engn, Philadelphia, PA 19104 USA
[3] Univ Penn, Dept Biostat Epidemiol & Informat, Philadelphia, PA 19104 USA
Source
Machine Learning and Knowledge Discovery in Databases: Research Track, ECML PKDD 2023, Part III | 2023, Vol. 14171
Funding
National Science Foundation (NSF)
Keywords
Interpretability; Additive Models; Group LASSO; Feature Selection; Variable Selection; LASSO; Regression; Shrinkage
DOI
10.1007/978-3-031-43418-1_21
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Interpretable machine learning has demonstrated impressive performance while preserving explainability. In particular, neural additive models (NAMs) bring interpretability to black-box deep learning and achieve state-of-the-art accuracy within the large family of generalized additive models. To empower NAM with feature selection and improve its generalization, we propose the sparse neural additive model (SNAM), which employs group sparsity regularization (e.g., the group LASSO): each feature is learned by a sub-network whose trainable parameters are clustered as a group. We study the theoretical properties of SNAM with novel techniques that tackle the non-parametric truth, thus extending beyond classical sparse linear models such as the LASSO, which only work on the parametric truth. Specifically, we show that SNAM trained with subgradient or proximal gradient descent provably converges to zero training loss as t → ∞, and that the estimation error of SNAM vanishes asymptotically as n → ∞. We also prove that SNAM, like the LASSO, achieves exact support recovery, i.e. perfect feature selection, under appropriate regularization. Moreover, we show that SNAM generalizes well and preserves 'identifiability', recovering each feature's effect. We validate our theory via extensive experiments that further demonstrate the accuracy and efficiency of SNAM. (The appendix can be found at https://arxiv.org/abs/2202.12482.)
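The architecture and penalty described in the abstract admit a compact implementation. Below is a minimal sketch, assuming PyTorch; the names SNAM, FeatureNet, and group_lasso_penalty are illustrative and not taken from the authors' code. Each feature gets its own small sub-network, the prediction is the sum of the sub-network outputs plus a bias, and the group LASSO penalty applies an l2 norm to each sub-network's parameter group so that an entire feature's network can be driven to zero.

```python
import torch
import torch.nn as nn

class FeatureNet(nn.Module):
    # Sub-network f_j learning the effect of a single feature (hypothetical name).
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x):  # x: (batch, 1)
        return self.net(x)

class SNAM(nn.Module):
    # Additive prediction: y_hat = bias + sum_j f_j(x_j).
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.subnets = nn.ModuleList(FeatureNet(hidden) for _ in range(n_features))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):  # x: (batch, n_features)
        outs = [f(x[:, j:j + 1]) for j, f in enumerate(self.subnets)]
        return torch.stack(outs, dim=0).sum(dim=0) + self.bias

def group_lasso_penalty(model):
    # One group per feature: the l2 norm over ALL parameters of f_j,
    # so the penalty can zero out an entire sub-network (feature selection).
    return sum(torch.sqrt(sum((p ** 2).sum() for p in f.parameters()))
               for f in model.subnets)

# One subgradient step on MSE + lambda * group-LASSO penalty (toy data).
model = SNAM(n_features=10)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(64, 10), torch.randn(64, 1)
loss = nn.functional.mse_loss(model(x), y) + 1e-3 * group_lasso_penalty(model)
opt.zero_grad()
loss.backward()
opt.step()
```

With plain SGD this corresponds to the subgradient-descent variant analyzed in the paper; the proximal-gradient variant would instead take a gradient step on the MSE alone and then apply group soft-thresholding to each sub-network's parameter group.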
Pages: 343-359 (17 pages)