TWO-SAMPLE TESTING OF HIGH-DIMENSIONAL LINEAR REGRESSION COEFFICIENTS VIA COMPLEMENTARY SKETCHING

Cited by: 0
Authors
Gao, Fengnan [1]
Wang, Tengyao [2]
Affiliations
[1] Fudan Univ, Shanghai Ctr Math Sci, Sch Data Sci, Shanghai, Peoples R China
[2] London Sch Econ, Dept Stat, London, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Two-sample hypotheses testing; high-dimensional data; linear model; sparsity; minimax detection; ANOVA
DOI
10.1214/22-AOS2216
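Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract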
We introduce a new method for two-sample testing of high-dimensional linear regression coefficients without assuming that those coefficients are individually estimable. The procedure works by first projecting the matrices of covariates and response vectors along directions that are complementary in sign in a subset of the coordinates, a process which we call "complementary sketching." The resulting projected covariates and responses are aggregated to form two test statistics, which are shown to have essentially optimal asymptotic power under a Gaussian design when the difference between the two regression coefficients is sparse and dense, respectively. Simulations confirm that our methods perform well in a broad class of settings, and an application to a large single-cell RNA sequencing dataset demonstrates their utility in the real world.
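The projection step described in the abstract can be illustrated with a small sketch. The abstract does not spell out the construction, so the following is an assumption rather than the authors' implementation: the projection directions are taken to be an orthonormal basis `A` of the orthogonal complement of the column space of the stacked design, so that the two blocks satisfy `A1^T X1 = -A2^T X2` (complementary in sign). All function and variable names here are hypothetical, the example uses only the Python standard library, and it requires `n1 + n2 > p`. It does not compute either test statistic.

```python
import math
import random


def orthonormal_complement(X):
    """Orthonormal basis of the orthogonal complement of col(X).

    Runs Gram-Schmidt on the columns of X followed by the standard
    basis vectors, and keeps only the vectors that come from the
    standard basis. Returns a list of length-n vectors.
    """
    n, p = len(X), len(X[0])
    columns = [[X[i][j] for i in range(n)] for j in range(p)]
    unit_vectors = [[1.0 if i == k else 0.0 for i in range(n)] for k in range(n)]
    basis, complement = [], []
    for idx, v in enumerate(columns + unit_vectors):
        w = list(v)
        for _ in range(2):  # two passes for numerical stability
            for q in basis:
                c = sum(a * b for a, b in zip(w, q))
                w = [a - c * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm > 1e-8:  # skip vectors already spanned by the basis
            q = [a / norm for a in w]
            basis.append(q)
            if idx >= p:
                complement.append(q)
    return complement


def complementary_sketch(X1, Y1, X2, Y2):
    """Return (Z, W) with Z = A1^T Y1 + A2^T Y2 and W = A1^T X1.

    Assumed construction: the vectors in A are orthogonal to every
    column of the stacked design, which forces A1^T X1 = -A2^T X2.
    """
    n1, p = len(X1), len(X1[0])
    A = orthonormal_complement(X1 + X2)  # rows of X1 stacked over rows of X2
    Y = Y1 + Y2
    Z = [sum(a * y for a, y in zip(col, Y)) for col in A]
    W = [[sum(col[i] * X1[i][j] for i in range(n1)) for j in range(p)]
         for col in A]
    return Z, W


# Noise-free toy data: the two coefficient vectors differ in one coordinate.
random.seed(0)
n1, n2, p = 6, 6, 4
X1 = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n1)]
X2 = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n2)]
beta1 = [1.0, 0.0, -2.0, 0.5]
beta2 = [1.0, 0.0, 0.0, 0.5]
Y1 = [sum(x * b for x, b in zip(row, beta1)) for row in X1]
Y2 = [sum(x * b for x, b in zip(row, beta2)) for row in X2]

Z, W = complementary_sketch(X1, Y1, X2, Y2)
theta = [b1 - b2 for b1, b2 in zip(beta1, beta2)]
# Without noise, Z should equal W @ theta up to rounding error.
max_err = max(abs(z - sum(w * t for w, t in zip(row, theta)))
              for z, row in zip(Z, W))
```

Under this assumed construction, the sketched data depend on the two coefficient vectors only through their difference `theta`, which is consistent with the abstract's claim that the coefficients need not be individually estimable.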
Pages: 2950-2972
Page count: 23
Related papers
50 in total
  • [41] TWO SAMPLE TESTS FOR HIGH-DIMENSIONAL COVARIANCE MATRICES
    Li, Jun
    Chen, Song Xi
    ANNALS OF STATISTICS, 2012, 40 (02) : 908 - 940
  • [42] The sparsity and bias of the lasso selection in high-dimensional linear regression
    Zhang, Cun-Hui
    Huang, Jian
    ANNALS OF STATISTICS, 2008, 36 (04) : 1567 - 1594
  • [43] A high dimensional two-sample test under a low dimensional factor structure
    Ma, Yingying
    Lan, Wei
    Wang, Hansheng
    JOURNAL OF MULTIVARIATE ANALYSIS, 2015, 140 : 162 - 170
  • [44] Test for high dimensional regression coefficients of partially linear models
    Wang, Siyang
    Cui, Hengjian
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2020, 49 (17) : 4091 - 4116
  • [45] Two-sample Behrens-Fisher problems for high-dimensional data: a normal reference scale-invariant test
    Zhang, Liang
    Zhu, Tianming
    Zhang, Jin-Ting
    JOURNAL OF APPLIED STATISTICS, 2023, 50 (03) : 456 - 476
  • [46] High-dimensional general linear hypothesis testing under heteroscedasticity
    Zhou, Bu
    Guo, Jia
    Zhang, Jin-Ting
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2017, 188 : 36 - 54
  • [47] A Sequential Rejection Testing Method for High-Dimensional Regression with Correlated Variables
    Mandozzi, Jacopo
    Buhlmann, Peter
    INTERNATIONAL JOURNAL OF BIOSTATISTICS, 2016, 12 (01) : 79 - 95
  • [48] Two sample test for high-dimensional partially paired data
    Lee, Seokho
    Lim, Johan
    Sohn, Insuk
    Jung, Sin-Ho
    Park, Cheol-Keun
    JOURNAL OF APPLIED STATISTICS, 2015, 42 (09) : 1946 - 1961
  • [49] Robust and sparse estimation methods for high-dimensional linear and logistic regression
    Kurnaz, Fatma Sevinc
    Hoffmann, Irene
    Filzmoser, Peter
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2018, 172 : 211 - 222
  • [50] Reduced rank regression with matrix projections for high-dimensional multivariate linear regression model
    Guo, Wenxing
    Balakrishnan, Narayanaswamy
    Bian, Mengjie
    ELECTRONIC JOURNAL OF STATISTICS, 2021, 15 (02) : 4167 - 4191