Distributed smoothed rank regression with heterogeneous errors for massive data

Cited by: 0
Authors
Xiaohui Yuan
Xinran Zhang
Yue Wang
Chunjie Wang
Affiliation
[1] Changchun University of Technology,School of Mathematics and Statistics
Source
Journal of the Korean Statistical Society | 2023 / Vol. 52
Keywords
Heterogeneous error; Massive data; Variable selection; Weighted rank estimator
DOI
Not available
Abstract
Rank estimation methods are robust and highly efficient for estimating linear regression models. This paper investigates rank regression estimation for massive data. To handle data that are distributed heterogeneously across blocks, we propose a weighted distributed rank-based estimator for massive data, which can improve on the efficiency of the standard divide-and-conquer estimator. Under mild conditions, the asymptotic distribution of the weighted distributed rank-based estimator is derived. To achieve sparsity with high-dimensional covariates, a variable selection procedure is also proposed. Both simulations and a data analysis are included to illustrate the finite-sample performance of the proposed methods.
Pages: 1078-1103
Number of pages: 25