Gradient Learning With the Mode-Induced Loss: Consistency Analysis and Applications

Times Cited: 0
Authors
Chen, Hong [1 ,2 ]
Fu, Youcheng [1 ,2 ]
Jiang, Xue [3 ]
Chen, Yanhong [4 ]
Li, Weifu [1 ,2 ]
Zhou, Yicong [5 ]
Zheng, Feng [3 ]
Affiliations
[1] Huazhong Agr Univ, Coll Sci, Wuhan 430070, Peoples R China
[2] Minist Educ, Engn Res Ctr Intelligent Technol Agr, Wuhan 430070, Peoples R China
[3] Southern Univ Sci & Technol, Dept Comp Sci & Engn, Shenzhen 518055, Peoples R China
[4] Chinese Acad Sci, Natl Space Sci Ctr, Beijing 100190, Peoples R China
[5] Univ Macau, Dept Comp & Informat Sci, Macau 999078, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Input variables; Estimation; Additives; Robustness; Kernel; Optimization; Learning systems; Gradient learning (GL); learning theory; mode-induced loss; Rademacher complexity; variable selection; regression; correntropy
DOI
10.1109/TNNLS.2023.3236345
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Variable selection methods aim to identify the key covariates related to the response variable in learning problems with high-dimensional data. Typical variable selection methods are formulated as sparse mean regression over a parametric hypothesis class, such as linear or additive functions. Despite rapid progress, existing methods depend heavily on the chosen parametric function class and cannot handle variable selection when the data noise is heavy-tailed or skewed. To circumvent these drawbacks, we propose sparse gradient learning with the mode-induced loss (SGLML) for robust model-free (MF) variable selection. We establish a theoretical analysis of SGLML, comprising an upper bound on the excess risk and the consistency of variable selection, which together guarantee its ability to estimate gradients, in the sense of gradient risk, and to identify informative variables under mild conditions. Experiments on simulated and real data demonstrate the competitive performance of our method over previous gradient learning (GL) methods.
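For intuition, here is a minimal sketch of the kind of objective the abstract describes, under assumed notation (bandwidth $\sigma$, locality weights $\omega_{ij}$, regularization parameter $\lambda$, RKHS norm $\|\cdot\|_K$); the exact formulation in the paper may differ. A Gaussian-type mode-induced loss can be written as

$$\ell_{\sigma}(t) = 1 - \exp\!\left(-\frac{t^{2}}{2\sigma^{2}}\right),$$

which is bounded in $t$, so large residuals caused by heavy-tailed or skewed noise cannot dominate the empirical risk. Gradient learning applies such a loss to first-order Taylor differences $y_j \approx y_i + \vec{g}(x_i)^{\top}(x_j - x_i)$ and estimates the gradient vector $\vec{g} = (g_1, \ldots, g_p)$ over a reproducing kernel Hilbert space $\mathcal{H}_K$ with a group-sparsity penalty:

$$\hat{\vec{g}} = \arg\min_{\vec{g} \in \mathcal{H}_K^{p}} \; \frac{1}{n(n-1)} \sum_{i \neq j} \omega_{ij}\, \ell_{\sigma}\!\big(y_i - y_j - \vec{g}(x_i)^{\top}(x_i - x_j)\big) \;+\; \lambda \sum_{l=1}^{p} \|g_l\|_K.$$

Variable selection is then model-free: coordinate $l$ is declared informative when $\|\hat{g}_l\|_K$ is nonzero (or exceeds a small threshold), since a partial derivative that vanishes everywhere indicates that the $l$-th input does not affect the response.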
Pages: 9686-9699
Page count: 14