Model-Free Statistical Inference on High-Dimensional Data

Cited by: 1
Authors
Guo, Xu [1]
Li, Runze [2]
Zhang, Zhe [2]
Zou, Changliang [3]
Affiliations
[1] Beijing Normal Univ, Sch Stat, Beijing, Peoples R China
[2] Penn State Univ, Dept Stat, University Pk, PA 16801 USA
[3] Nankai Univ, Sch Stat & Data Sci, Tianjin, Peoples R China
Funding
National Institutes of Health (USA); National Key R&D Program of China; National Natural Science Foundation of China
Keywords
False discovery rate control; Marginal coordinate hypothesis; Orthogonality; Sufficient dimension reduction; FALSE DISCOVERY RATE; CONFIDENCE-INTERVALS; VARIABLE SELECTION; REDUCTION; REGRESSION; TESTS; VARIANCE; REGIONS; PARAMETERS;
DOI
10.1080/01621459.2024.2310314
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
This article develops an effective model-free inference procedure for high-dimensional data. We first reformulate the hypothesis testing problem within the sufficient dimension reduction framework. With the aid of this reformulation, we propose a new test statistic and show that its asymptotic distribution is a χ² distribution whose degrees of freedom do not depend on the unknown population distribution. We further conduct a power analysis under local alternative hypotheses. In addition, we study how to control the false discovery rate of the proposed χ² tests, which are correlated, in order to identify important predictors under a model-free framework. To this end, we propose a multiple testing procedure and establish its theoretical guarantees. Monte Carlo simulation studies are conducted to assess the performance of the proposed tests, and an empirical analysis of a real-world dataset illustrates the proposed methodology. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
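To make the idea of false discovery rate control over a family of χ² tests concrete, the following is a minimal sketch only. It is not the multiple testing procedure proposed in the article, whose details the abstract does not give. It assumes, purely for illustration, that each statistic has 1 degree of freedom, and it uses the classical Benjamini-Hochberg step-up rule as a stand-in; that rule is not guaranteed to control the FDR under arbitrary correlation among tests. The function names and the example statistics are hypothetical.

```python
import math


def chi2_df1_pvalue(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    Assumption for illustration: df = 1, so P(X > stat) = erfc(sqrt(stat / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


def benjamini_hochberg(pvalues, q=0.1):
    """Indices rejected by the Benjamini-Hochberg step-up rule at level q.

    Generic stand-in, not the article's procedure.
    """
    m = len(pvalues)
    # Order hypotheses from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is below its step-up threshold.
        if pvalues[idx] <= rank * q / m:
            k_max = rank
    return sorted(order[:k_max])


# Hypothetical test statistics, one per predictor.
stats = [15.1, 0.3, 9.8, 1.2, 0.05, 12.4]
pvals = [chi2_df1_pvalue(s) for s in stats]
rejected = benjamini_hochberg(pvals, q=0.1)
print(rejected)  # -> [0, 2, 5]
```

In this toy input, the three predictors with large statistics are flagged and the remaining three are not.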
Pages: 186-197
Number of pages: 12