Doubly Distributed Supervised Learning and Inference with High-Dimensional Correlated Outcomes

Cited by: 0
Authors
Hector, Emily C. [1]
Song, Peter X-K [2]
Affiliations
[1] North Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
[2] Univ Michigan, Dept Biostat, Ann Arbor, MI 48104 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
Divide-and-conquer; Generalized method of moments; Estimating functions; Parallel computing; Scalable computing; LIKELIHOOD ESTIMATION; QUASI-LIKELIHOOD; REGRESSION; STATISTICS; BINARY; MODELS;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This paper presents a unified framework for supervised learning and inference procedures using the divide-and-conquer approach for high-dimensional correlated outcomes. We propose a general class of estimators that can be implemented in a fully distributed and parallelized computational scheme. Modeling, computational and theoretical challenges related to high-dimensional correlated outcomes are overcome by dividing data at both outcome and subject levels, estimating the parameter of interest from blocks of data using a broad class of supervised learning procedures, and combining block estimators in a closed-form meta-estimator. This meta-estimator is asymptotically equivalent to estimates obtained by Hansen's (1982) generalized method of moments (GMM), and it does not require the entire data set to be reloaded on a common server. We provide rigorous theoretical justifications for the use of distributed estimators with correlated outcomes by studying the asymptotic behaviour of the combined estimator with fixed and diverging numbers of data divisions. Simulations illustrate the finite sample performance of the proposed method, and we provide an R package for ease of implementation.
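The divide, estimate, combine workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator or their R package: it assumes a linear model fitted by ordinary least squares in each block and swaps in a simple precision-weighted average for the GMM-based meta-estimator, so it ignores the between-block correlation that the paper's combination step accounts for. The helper names `block_ols` and `combine`, the simulated data, and the block counts are all hypothetical.

```python
import numpy as np

def block_ols(X, y):
    """OLS estimate and estimated precision matrix for one block (hypothetical helper)."""
    XtX = X.T @ X
    beta = np.linalg.solve(XtX, X.T @ y)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    precision = XtX / sigma2  # inverse of the estimated covariance of beta
    return beta, precision

def combine(estimates):
    """Closed-form precision-weighted average of block estimates.

    Simplified stand-in for a GMM-type combination: it treats blocks as
    independent, which is an assumption of this sketch only.
    """
    total_precision = sum(P for _, P in estimates)
    weighted_sum = sum(P @ b for b, P in estimates)
    return np.linalg.solve(total_precision, weighted_sum)

rng = np.random.default_rng(0)
n, p, m = 600, 3, 4  # subjects, covariates, outcomes per subject
beta_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, p))
# A shared subject-level effect makes the m outcomes of a subject correlated.
subject_effect = rng.normal(size=(n, 1))
Y = (X @ beta_true)[:, None] + subject_effect + rng.normal(size=(n, m))

# Divide at both the subject level (rows) and the outcome level (columns).
subject_blocks = np.array_split(np.arange(n), 3)
outcome_blocks = np.array_split(np.arange(m), 2)

estimates = []
for rows in subject_blocks:
    for cols in outcome_blocks:
        # Stack the outcomes of this block into one long regression;
        # each subject's covariate row is repeated once per outcome.
        Xb = np.repeat(X[rows], len(cols), axis=0)
        yb = Y[np.ix_(rows, cols)].ravel()
        estimates.append(block_ols(Xb, yb))

# Only the small (beta, precision) summaries are combined,
# so the full data never has to sit on one machine.
beta_hat = combine(estimates)
print(beta_hat)
```

Each block only ships a p-vector and a p-by-p matrix to the combining step, which is the property that makes this kind of scheme suitable for parallel workers; the loop over blocks could be dispatched to separate processes without changing `combine`.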
Pages: 35