Generating knockoffs via conditional independence

Times cited: 0
Authors
Dreassi, Emanuela [1]
Leisen, Fabrizio [2]
Pratelli, Luca [3]
Rigo, Pietro [4]
Affiliations
[1] Univ Firenze, Florence, Italy
[2] Kings Coll London, London, England
[3] Acad Navale Livorno, Livorno, Italy
[4] Univ Bologna, Bologna, Italy
Source
ELECTRONIC JOURNAL OF STATISTICS | 2024 / Vol. 18 / No. 01
Keywords
Approximation; conditional independence; high-dimensional regression; knockoffs; multivariate dependence; partial exchangeability; variable selection; FALSE DISCOVERY RATE; INFERENCE;
DOI
10.1214/23-EJS2198
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
Let $X$ be a $p$-variate random vector and $\tilde{X}$ a knockoff copy of $X$ (in the sense of [9]). A new approach for constructing $\tilde{X}$ (henceforth, NA) has been introduced in [8]. NA has essentially three advantages: (i) To build $\tilde{X}$ is straightforward; (ii) The joint distribution of $(X, \tilde{X})$ can be written in closed form; (iii) $\tilde{X}$ is often optimal under various criteria. However, for NA to apply, $X_1, \ldots, X_p$ should be conditionally independent given some random element $Z$. Our first result is that any probability measure $\mu$ on $\mathbb{R}^p$ can be approximated by a probability measure $\mu_0$ of the form $$\mu_0(A_1 \times \cdots \times A_p) = E\Bigl\{\prod_{i=1}^{p} P(X_i \in A_i \mid Z)\Bigr\}.$$ The approximation is in total variation distance when $\mu$ is absolutely continuous, and an explicit formula for $\mu_0$ is provided. If $X \sim \mu_0$, then $X_1, \ldots, X_p$ are conditionally independent. Hence, with a negligible error, one can assume $X \sim \mu_0$ and build $\tilde{X}$ through NA. Our second result is a characterization of the knockoffs $\tilde{X}$ obtained via NA. It is shown that $\tilde{X}$ is of this type if and only if the pair $(X, \tilde{X})$ can be extended to an infinite sequence so as to satisfy certain invariance conditions. The basic tool for proving this fact is de Finetti's theorem for partially exchangeable sequences. In addition to the quoted results, an explicit formula for the conditional distribution of $\tilde{X}$ given $X$ is obtained in a few cases. In one of such cases, it is assumed $X_i \in \{0, 1\}$ for all $i$.
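The abstract does not spell out how NA builds the knockoff, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes a toy model in which $Z$ takes finitely many values with known probabilities `weights`, each binary $X_i$ is Bernoulli with known parameter `theta[k, i]` given $Z = k$, and the knockoff is drawn as a fresh conditionally independent copy given $Z$, with $Z$ sampled from its posterior given the observed $X$. The function name `sample_knockoff` and all parameter names are hypothetical.

```python
import numpy as np

def sample_knockoff(x, weights, theta, rng):
    """Draw one candidate knockoff of a binary vector x (toy mixture model).

    Assumptions (not taken from the paper): Z is discrete with prior
    `weights` (shape K), and X_i | Z = k ~ Bernoulli(theta[k, i]) with
    every theta strictly between 0 and 1 (theta has shape K x p).
    """
    x = np.asarray(x)
    # log P(X = x | Z = k) for each k, using conditional independence given Z
    log_lik = x @ np.log(theta).T + (1 - x) @ np.log1p(-theta).T
    # Posterior of Z given X = x, normalised in a numerically stable way
    log_post = np.log(weights) + log_lik
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    # Sample Z from its posterior, then draw each coordinate independently
    k = rng.choice(len(weights), p=post)
    return (rng.random(x.shape[0]) < theta[k]).astype(int)

rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
theta = np.array([[0.9, 0.8, 0.7],
                  [0.1, 0.2, 0.3]])
x_tilde = sample_knockoff([1, 0, 1], weights, theta, rng)
```

Under these assumptions the pair $(X, \tilde{X})$ is conditionally i.i.d. across the two copies given $Z$, so swapping $X_i$ with $\tilde{X}_i$ for any set of indices leaves the joint law unchanged, and $\tilde{X}$ depends on the data only through $X$.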
Pages: 119 - 144
Number of pages: 26
Related papers
20 in total
  • [1] Aldous, David J., 1985, ECOLE DETE PROBABILI, DOI 10.1007/BFB0099421
  • [2] TESTING GOODNESS-OF-FIT AND CONDITIONAL INDEPENDENCE WITH APPROXIMATE CO-SUFFICIENT SAMPLING
    Barber, Rina Foygel
    Janson, Lucas
    [J]. ANNALS OF STATISTICS, 2022, 50 (05) : 2514 - 2544
  • [3] ROBUST INFERENCE WITH KNOCKOFFS
    Barber, Rina Foygel
    Candes, Emmanuel J.
    Samworth, Richard J.
    [J]. ANNALS OF STATISTICS, 2020, 48 (03) : 1409 - 1431
  • [4] CONTROLLING THE FALSE DISCOVERY RATE VIA KNOCKOFFS
    Barber, Rina Foygel
    Candes, Emmanuel J.
    [J]. ANNALS OF STATISTICS, 2015, 43 (05) : 2055 - 2085
  • [5] Metropolized Knockoff Sampling
    Bates, Stephen
    Candes, Emmanuel
    Janson, Lucas
    Wang, Wenshuo
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2021, 116 (535) : 1413 - 1427
  • [6] CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING
    BENJAMINI, Y
    HOCHBERG, Y
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) : 289 - 300
  • [7] Limit theorems for a class of identically distributed random variables
    Berti, P
    Pratelli, L
    Rigo, P
    [J]. ANNALS OF PROBABILITY, 2004, 32 (3A) : 2029 - 2052
  • [8] New perspectives on knockoffs construction
    Berti, Patrizia
    Dreassi, Emanuela
    Leisen, Fabrizio
    Pratelli, Luca
    Rigo, Pietro
    [J]. JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2023, 223 : 1 - 14
  • [9] Panning for gold: 'model-X' knockoffs for high dimensional controlled variable selection
    Candes, Emmanuel
    Fan, Yingying
    Janson, Lucas
    Lv, Jinchi
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2018, 80 (03) : 551 - 577
  • [10] DE FINETTI THEOREM FOR MARKOV-CHAINS
    DIACONIS, P
    FREEDMAN, D
    [J]. ANNALS OF PROBABILITY, 1980, 8 (01) : 115 - 130