Doubly Distributed Supervised Learning and Inference with High-Dimensional Correlated Outcomes

Authors
Hector, Emily C. [1]
Song, Peter X-K [2]
Affiliations
[1] North Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
[2] Univ Michigan, Dept Biostat, Ann Arbor, MI 48104 USA
Funding
US National Institutes of Health; US National Science Foundation;
Keywords
Divide-and-conquer; Generalized method of moments; Estimating functions; Parallel computing; Scalable computing; LIKELIHOOD ESTIMATION; QUASI-LIKELIHOOD; REGRESSION; STATISTICS; BINARY; MODELS;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper presents a unified framework for supervised learning and inference procedures using the divide-and-conquer approach for high-dimensional correlated outcomes. We propose a general class of estimators that can be implemented in a fully distributed and parallelized computational scheme. Modeling, computational and theoretical challenges related to high-dimensional correlated outcomes are overcome by dividing data at both outcome and subject levels, estimating the parameter of interest from blocks of data using a broad class of supervised learning procedures, and combining block estimators into a closed-form meta-estimator that is asymptotically equivalent to the estimator obtained by Hansen's (1982) generalized method of moments (GMM) but does not require the entire data set to be reloaded on a common server. We provide rigorous theoretical justifications for the use of distributed estimators with correlated outcomes by studying the asymptotic behaviour of the combined estimator with fixed and diverging number of data divisions. Simulations illustrate the finite sample performance of the proposed method, and we provide an R package for ease of implementation.
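The divide, estimate, and combine steps described in the abstract can be illustrated with a toy sketch. This is not the authors' estimator or their R package: it assumes a single slope parameter shared by all outcomes, uses least squares through the origin as the block-level learner, and combines block estimates with simple inverse-variance weights. That weighting is a simplified stand-in for the GMM-type meta-estimator, since it ignores the correlation between outcome blocks that the paper's combination step accounts for. All function names (`block_slope`, `doubly_divided_estimate`, `simulate`) and the interleaved block split are hypothetical choices made for illustration.

```python
import random


def block_slope(xs, ys):
    """Least-squares slope through the origin for one block, plus a naive variance."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    beta = sxy / sxx
    resid_ss = sum((y - beta * x) ** 2 for x, y in zip(xs, ys))
    var = resid_ss / (len(xs) - 1) / sxx  # treats observations as independent
    return beta, var


def doubly_divided_estimate(x, Y, n_subject_blocks, n_outcome_blocks):
    """Split subjects (rows) and outcomes (columns), fit each block, then combine."""
    n, m = len(Y), len(Y[0])
    estimates = []
    for i in range(n_subject_blocks):
        rows = range(i, n, n_subject_blocks)      # assumed interleaved subject split
        for j in range(n_outcome_blocks):
            cols = range(j, m, n_outcome_blocks)  # assumed interleaved outcome split
            xs = [x[r] for r in rows for _ in cols]
            ys = [Y[r][c] for r in rows for c in cols]
            estimates.append(block_slope(xs, ys))
    # Closed-form combination: inverse-variance weighted average of block estimates.
    # Simplification: cross-block correlation is ignored here.
    weights = [1.0 / v for _, v in estimates]
    combined = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    return combined, estimates


def simulate(n=200, m=6, beta=2.0, seed=0):
    """Generate n subjects with m correlated outcomes sharing one slope beta."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    Y = []
    for r in range(n):
        u = rng.gauss(0, 0.5)  # shared subject effect induces outcome correlation
        Y.append([beta * x[r] + u + rng.gauss(0, 0.5) for _ in range(m)])
    return x, Y


x, Y = simulate()
combined, blocks = doubly_divided_estimate(x, Y, n_subject_blocks=4, n_outcome_blocks=2)
print(round(combined, 3), len(blocks))
```

Each block only needs its own rows and columns, so the block fits could run on separate workers and send back just an estimate and a variance. The combination step then works on those summaries alone, which mirrors the abstract's point that the full data need not be reloaded on a common server.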
Pages: 35