The Contextual Lasso: Sparse Linear Models via Deep Neural Networks
Citations: 0
Authors:
Thompson, Ryan [1,2]
Dezfouli, Amir [3]
Kohn, Robert [1]
Affiliations:
[1] Univ New South Wales, Sydney, NSW, Australia
[2] CSIRO's Data61, Eveleigh, Australia
[3] BIMLOGIQ, Sydney, NSW, Australia
Source:
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
Keywords:
REGRESSION; REGULARIZATION; SELECTION
DOI:
Not available
Chinese Library Classification (CLC):
TP18 [Artificial intelligence theory]
Subject classification codes:
081104; 0812; 0835; 1405
Abstract:
Sparse linear models are one of several core tools for interpretable machine learning, a field of emerging importance as predictive models permeate decision-making in many domains. Unfortunately, sparse linear models are far less flexible as functions of their input features than black-box models like deep neural networks. With this capability gap in mind, we study a not-uncommon situation where the input features dichotomize into two groups: explanatory features, which are candidates for inclusion as variables in an interpretable model, and contextual features, which select from the candidate variables and determine their effects. This dichotomy leads us to the contextual lasso, a new statistical estimator that fits a sparse linear model to the explanatory features such that the sparsity pattern and coefficients vary as a function of the contextual features. The fitting process learns this function nonparametrically via a deep neural network. To attain sparse coefficients, we train the network with a novel lasso regularizer in the form of a projection layer that maps the network's output onto the space of ℓ1-constrained linear models. An extensive suite of experiments on real and synthetic data suggests that the learned models, which remain highly transparent, can be sparser than the regular lasso without sacrificing the predictive power of a standard deep neural network.
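The projection layer described in the abstract can be sketched concretely. Below is a minimal, hypothetical PyTorch sketch (not the authors' released code) of the Euclidean projection onto the ℓ1 ball (Duchi et al., 2008), the standard operation such a layer would apply to the network-predicted coefficients; the function name and the radius parameter `r` are illustrative assumptions.

import torch

def project_l1_ball(beta: torch.Tensor, r: float = 1.0) -> torch.Tensor:
    """Project each row of `beta` onto {b : ||b||_1 <= r}.

    Hypothetical sketch of the kind of projection layer the paper describes;
    uses the sort-based algorithm of Duchi et al. (2008).
    """
    abs_beta = beta.abs()
    inside = abs_beta.sum(dim=-1) <= r            # rows already inside the ball
    # Sort magnitudes in decreasing order to locate the soft threshold.
    sorted_mag, _ = torch.sort(abs_beta, dim=-1, descending=True)
    cssv = sorted_mag.cumsum(dim=-1) - r          # cumulative sums minus radius
    k = torch.arange(1, beta.shape[-1] + 1, device=beta.device, dtype=beta.dtype)
    cond = sorted_mag - cssv / k > 0              # holds for indices 1..rho
    rho = cond.to(beta.dtype).sum(dim=-1).clamp(min=1)
    theta = cssv.gather(-1, (rho.long() - 1).unsqueeze(-1)).squeeze(-1) / rho
    # Soft-thresholding is what yields exact zeros, i.e. sparse coefficients.
    projected = torch.sign(beta) * (abs_beta - theta.unsqueeze(-1)).clamp(min=0)
    return torch.where(inside.unsqueeze(-1), beta, projected)

In the contextual lasso, a network maps the contextual features to a coefficient vector for the explanatory features; composing that map with a projection of this kind makes the output coefficients exactly sparse while keeping the whole pipeline trainable by gradient descent.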