Empirical likelihood ratio tests for non-nested model selection based on predictive losses

Authors
Jiang, Jiancheng [1 ,2 ]
Jiang, Xuejun [3 ]
Wang, Haofeng [3 ]
Affiliations
[1] Univ N Carolina, Dept Math & Stat, Charlotte, NC 28223 USA
[2] Univ N Carolina, Sch Data Sci, Charlotte, NC 28223 USA
[3] Southern Univ Sci & Technol, Dept Stat & Data Sci, Shenzhen 518055, Peoples R China
Keywords
Cross-validation; nonparametric smoothing; scalable distributed test; COEFFICIENT REGRESSION-MODELS; SEPARATE FAMILIES; INFERENCES;
DOI
10.3150/23-BEJ1640
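Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes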
020208 ; 070103 ; 0714 ;
Abstract
We propose an empirical likelihood ratio (ELR) test for comparing any two supervised learning models, which may be nested, non-nested, overlapping, misspecified, or correctly specified. The test compares the prediction losses of the models based on cross-validation. We determine the asymptotic null and alternative distributions of the ELR test for comparing two nonparametric learning models under a general framework of convex loss functions. However, the prediction losses from cross-validation involve repeatedly fitting the models with one observation left out, which leads to a heavy computational burden. We introduce an easy-to-implement ELR test which requires fitting the models only once and shares the same asymptotics as the original one. The proposed tests are applied to compare additive models with varying-coefficient models. Furthermore, a scalable distributed ELR test is proposed for testing the importance of a group of variables in possibly misspecified additive models with massive data. Simulations show that the proposed tests work well and have favorable finite-sample performance compared to some existing approaches. The methodology is validated on an empirical application.
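The general idea in the abstract (compare two models through their cross-validated prediction losses, then test whether the mean loss difference is zero with an empirical likelihood ratio) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two polynomial least-squares models stand in for the nonparametric learners, the loss is squared error, the statistic is the classical empirical likelihood ratio for a mean, and the chi-square(1) critical value 3.841 is the textbook i.i.d. calibration rather than the asymptotic result derived by the authors. The function names `loo_squared_losses` and `el_statistic` are hypothetical.

```python
import numpy as np


def loo_squared_losses(X, y):
    """Leave-one-out squared prediction losses for a least-squares fit."""
    n = len(y)
    losses = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # drop observation i, refit, predict it
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        losses[i] = (y[i] - X[i] @ beta) ** 2
    return losses


def el_statistic(d, n_iter=200):
    """-2 log empirical likelihood ratio for H0: E[d] = 0 (classical EL for a mean)."""
    d = np.asarray(d, dtype=float)
    if d.min() >= 0.0 or d.max() <= 0.0:
        return np.inf  # zero lies outside the convex hull of the data
    # The multiplier lam must keep every weight positive: 1 + lam * d_i > 0.
    lo, hi = -1.0 / d.max(), -1.0 / d.min()
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:
            break  # interval has collapsed to floating-point precision
        # sum d_i / (1 + lam d_i) is strictly decreasing in lam; find its root.
        if np.sum(d / (1.0 + mid * d)) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * np.sum(np.log1p(lam * d))


# Simulated data (assumed design): the truth is quadratic in x.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1.0, 1.0, size=n)
y = 1.0 + 2.0 * x + 3.0 * x**2 + rng.normal(scale=0.5, size=n)

X_a = np.column_stack([np.ones(n), x])        # model A: linear in x
X_b = np.column_stack([np.ones(n), x, x**2])  # model B: quadratic in x

# Per-observation difference in cross-validated prediction loss.
d = loo_squared_losses(X_a, y) - loo_squared_losses(X_b, y)
stat = el_statistic(d)

# 3.841 is the 95% quantile of chi-square(1); used here as an assumed calibration.
reject = stat > 3.841
print(f"mean loss difference = {d.mean():.3f}, ELR statistic = {stat:.2f}, reject H0: {reject}")
```

The explicit leave-one-out loop refits each model n times, which is the computational burden the abstract mentions; the sketch does not attempt to reproduce the authors' fit-once variant or their distributed test.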
Pages: 1458-1481
Page count: 24