More Powerful Selective Inference for the Graph Fused Lasso

Cited by: 6
Authors
Chen, Yiqun [1]
Jewell, Sean [2]
Witten, Daniela [1, 2]
Affiliations
[1] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
[2] Univ Washington, Dept Stat, Seattle, WA 98195 USA
Funding
National Institutes of Health (USA);
Keywords
Changepoint detection; Hypothesis testing; Penalized regression; Piecewise constant; POINT;
DOI
10.1080/10618600.2022.2097246
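Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes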
020208 ; 070103 ; 0714 ;
Abstract
The graph fused lasso, which includes the one-dimensional fused lasso as a special case, is widely used to reconstruct signals that are piecewise constant on a graph, meaning that nodes connected by an edge tend to have identical values. We consider testing for a difference in the means of two connected components estimated using the graph fused lasso. A naive procedure such as a z-test for a difference in means will not control the selective Type I error, since the hypothesis that we are testing is itself a function of the data. In this work, we propose a new test for this task that controls the selective Type I error and conditions on less information than existing approaches, leading to substantially higher power. We illustrate our approach in simulation and on datasets of drug overdose death rates and teenage birth rates in the contiguous United States. Our approach yields more discoveries on both datasets. Supplementary materials for this article are available online.
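The abstract's point that a naive z-test fails once the hypothesis is chosen from the data can be illustrated with a small simulation. This is a minimal sketch, not the authors' procedure: it replaces the graph fused lasso with a hypothetical single-changepoint selector (`best_split`) on a one-dimensional sequence, assumes a known noise standard deviation of 1, and uses illustrative function names throughout.

```python
import math
import numpy as np

def naive_z_pvalue(y, split, sigma=1.0):
    """Two-sided z-test for equal means of y[:split] and y[split:], with known sigma."""
    left, right = y[:split], y[split:]
    se = sigma * math.sqrt(1.0 / len(left) + 1.0 / len(right))
    z = (left.mean() - right.mean()) / se
    # For Z ~ N(0, 1), P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

def best_split(y):
    """Assumed stand-in for segmentation: the split with the largest
    absolute standardized difference between left and right means."""
    n = len(y)
    csum = np.cumsum(y)
    t = np.arange(1, n)                      # candidate split points 1..n-1
    left_mean = csum[:-1] / t
    right_mean = (csum[-1] - csum[:-1]) / (n - t)
    z = (left_mean - right_mean) / np.sqrt(1.0 / t + 1.0 / (n - t))
    return int(t[np.argmax(np.abs(z))])

def rejection_rates(n=50, reps=2000, alpha=0.05, seed=0):
    """Rejection rates of the naive test under a global null (constant mean)."""
    rng = np.random.default_rng(seed)
    fixed = selected = 0
    for _ in range(reps):
        y = rng.normal(size=n)               # no true changepoint anywhere
        fixed += naive_z_pvalue(y, n // 2) < alpha          # split chosen in advance
        selected += naive_z_pvalue(y, best_split(y)) < alpha  # split chosen from y
    return fixed / reps, selected / reps

if __name__ == "__main__":
    fixed_rate, selected_rate = rejection_rates()
    print(f"pre-specified split: {fixed_rate:.3f}")
    print(f"data-driven split:   {selected_rate:.3f}")
```

With a pre-specified split, the rejection rate should sit near the nominal level of 0.05, while the data-driven split should reject far more often even though no changepoint exists. A selective test such as the one the abstract proposes corrects for this by conditioning on the selection event; that correction is not implemented here.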
Pages: 577-587
Page count: 11