Doubly Distributed Supervised Learning and Inference with High-Dimensional Correlated Outcomes

Cited by: 0
Authors
Hector, Emily C. [1]
Song, Peter X-K [2]
Affiliations
[1] North Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
[2] Univ Michigan, Dept Biostat, Ann Arbor, MI 48104 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
Divide-and-conquer; Generalized method of moments; Estimating functions; Parallel computing; Scalable computing; LIKELIHOOD ESTIMATION; QUASI-LIKELIHOOD; REGRESSION; STATISTICS; BINARY; MODELS;
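DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract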
This paper presents a unified framework for supervised learning and inference procedures using the divide-and-conquer approach for high-dimensional correlated outcomes. We propose a general class of estimators that can be implemented in a fully distributed and parallelized computational scheme. Modeling, computational and theoretical challenges related to high-dimensional correlated outcomes are overcome by dividing data at both outcome and subject levels, estimating the parameter of interest from blocks of data using a broad class of supervised learning procedures, and combining block estimators in a closed-form meta-estimator. This meta-estimator is asymptotically equivalent to estimates obtained by Hansen's (1982) generalized method of moments (GMM), but does not require the entire data to be reloaded on a common server. We provide rigorous theoretical justifications for the use of distributed estimators with correlated outcomes by studying the asymptotic behaviour of the combined estimator with a fixed and with a diverging number of data divisions. Simulations illustrate the finite sample performance of the proposed method, and we provide an R package for ease of implementation.
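The divide-and-combine idea in the abstract can be illustrated with a much simpler setting. The Python sketch below is not the authors' estimator and is unrelated to their R package. It assumes independent subjects, a single outcome, and ordinary least squares within each block, and it splits the data at the subject level only. Block estimates are combined in the closed form (sum_k W_k)^{-1} sum_k W_k theta_k with W_k = X_k'X_k, which in this special case reproduces the full-data fit without pooling the raw data. The function names `block_ols` and `combine_blocks` are hypothetical.

```python
import numpy as np


def block_ols(X, y):
    """Return the OLS estimate and the weight matrix X'X for one data block."""
    info = X.T @ X
    theta = np.linalg.solve(info, X.T @ y)
    return theta, info


def combine_blocks(estimates, infos):
    """Closed-form combination: (sum_k W_k)^{-1} sum_k W_k theta_k."""
    total_info = sum(infos)
    weighted = sum(W @ th for W, th in zip(infos, estimates))
    return np.linalg.solve(total_info, weighted)


# Simulated data (assumption: independent subjects, one outcome, linear model).
rng = np.random.default_rng(0)
n, p, n_blocks = 600, 3, 4
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Divide subjects into blocks; each block could be fitted on a separate worker.
X_blocks = np.array_split(X, n_blocks)
y_blocks = np.array_split(y, n_blocks)
results = [block_ols(Xk, yk) for Xk, yk in zip(X_blocks, y_blocks)]
estimates, infos = zip(*results)

# Combine using only the block summaries, never the pooled raw data.
theta_combined = combine_blocks(estimates, infos)

# Reference fit on the full data, for comparison only.
theta_full, _ = block_ols(X, y)
```

Only the per-block summaries (an estimate and a p-by-p weight matrix) are passed to the combination step, which is the property the abstract highlights. Correlated outcomes and division at the outcome level, the main subject of the paper, are not covered by this sketch.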
Pages: 35
Related papers
50 in total
  • [1] A Distributed and Integrated Method of Moments for High-Dimensional Correlated Data Analysis
    Hector, Emily C.
    Song, Peter X-K
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2021, 116 (534) : 805 - 818
  • [2] Online Variational Bayes Inference for High-Dimensional Correlated Data
    Kabisa, Sylvie
    Dunson, David B.
    Morris, Jeffrey S.
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2016, 25 (02) : 426 - 444
  • [3] DOUBLY DEBIASED LASSO: HIGH-DIMENSIONAL INFERENCE UNDER HIDDEN CONFOUNDING
    Guo, Zijian
    Cevid, Domagoj
    Buhlmann, Peter
    ANNALS OF STATISTICS, 2022, 50 (03) : 1320 - 1347
  • [4] Optimal and Safe Estimation for High-Dimensional Semi-Supervised Learning
    Deng, Siyi
    Ning, Yang
    Zhao, Jiwei
    Zhang, Heping
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2024, 119 (548) : 2748 - 2759
  • [5] High-dimensional Simultaneous Inference of Quantiles
    Lou, Zhipeng
    Wu, Wei Biao
    SANKHYA-SERIES A-MATHEMATICAL STATISTICS AND PROBABILITY, 2025,
  • [6] Inference for High-Dimensional Exchangeable Arrays
    Chiang, Harold D.
    Kato, Kengo
    Sasaki, Yuya
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2023, 118 (543) : 1595 - 1605
  • [7] High-dimensional Bayesian inference via the unadjusted Langevin algorithm
    Durmus, Alain
    Moulines, Eric
    BERNOULLI, 2019, 25 (4A) : 2854 - 2882
  • [8] Rejoinder on: High-dimensional simultaneous inference with the bootstrap
    Dezeure, Ruben
    Bühlmann, Peter
    Zhang, Cun-Hui
    TEST, 2017, 26 (04) : 751 - 758
  • [9] Inference of Breakpoints in High-dimensional Time Series
    Chen, Likai
    Wang, Weining
    Wu, Wei Biao
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2022, 117 (540) : 1951 - 1963
  • [10] Doubly robust semiparametric inference using regularized calibrated estimation with high-dimensional data
    Ghosh, Satyajit
    Tan, Zhiqiang
    BERNOULLI, 2022, 28 (03) : 1675 - 1703