Sequential Predictive Two-Sample and Independence Testing

Cited by: 0
Authors
Podkopaev, Aleksandr [1]
Ramdas, Aaditya [2]
Affiliations
[1] Walmart Global Tech, San Bruno, CA 94086 USA
[2] Carnegie Mellon Univ, Pittsburgh, PA USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
We study the problems of sequential nonparametric two-sample and independence testing. Sequential tests process data online and allow using observed data to decide whether to stop and reject the null hypothesis or to collect more data, while maintaining type I error control. We build upon the principle of (nonparametric) testing by betting, where a gambler places bets on future observations and their wealth measures evidence against the null hypothesis. While recently developed kernel-based betting strategies often work well on simple distributions, selecting a suitable kernel for high-dimensional or structured data, such as images, is often nontrivial. To address this drawback, we design prediction-based betting strategies that rely on the following fact: if a sequentially updated predictor starts to consistently determine (a) which distribution an instance is drawn from, or (b) whether an instance is drawn from the joint distribution or the product of the marginal distributions (the latter produced by external randomization), it provides evidence against the two-sample or independence nulls respectively. We empirically demonstrate the superiority of our tests over kernel-based approaches under structured settings. Our tests can be applied beyond the case of independent and identically distributed data, remaining valid and powerful even when the data distribution drifts over time.
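The betting principle described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and makes several assumptions of its own: observations arrive in pairs (x, y), the predictor is a small online logistic regression, and the bet fraction `bet` is a fixed constant in [0, 1]. The names `sequential_two_sample_test`, `logit`, and `make_pairs` are hypothetical.

```python
import math
import random


def logit(weights, bias, z):
    """Linear score of feature vector z under the current predictor."""
    return sum(w * v for w, v in zip(weights, z)) + bias


def sequential_two_sample_test(pairs, alpha=0.05, bet=0.5, lr=0.1):
    """Run a betting test on a stream of (x, y) feature-vector pairs.

    Returns (rejected, rounds_used, final_wealth).
    Assumption: bet is a fixed fraction in [0, 1] (a simplification).
    """
    weights, bias = None, 0.0
    wealth, rounds = 1.0, 0
    for x, y in pairs:
        rounds += 1
        if weights is None:
            weights = [0.0] * len(x)

        # Bet with the predictor fitted on PAST rounds only. tanh keeps each
        # score in [-1, 1], so the payoff is in [-1, 1] and, when x and y
        # share one distribution, has conditional mean zero.
        payoff = 0.5 * (math.tanh(0.5 * logit(weights, bias, x))
                        - math.tanh(0.5 * logit(weights, bias, y)))
        wealth *= 1.0 + bet * payoff  # factor stays in [1 - bet, 1 + bet]

        # Ville's inequality: under the null, the wealth ever reaching
        # 1/alpha has probability at most alpha, so stopping here is valid.
        if wealth >= 1.0 / alpha:
            return True, rounds, wealth

        # Only after betting, update the predictor with one SGD step of
        # logistic loss per point (label +1 for x, -1 for y).
        for z, label in ((x, 1.0), (y, -1.0)):
            margin = label * logit(weights, bias, z)
            margin = max(-30.0, min(30.0, margin))  # avoid exp overflow
            grad = -label / (1.0 + math.exp(margin))
            weights = [w - lr * grad * v for w, v in zip(weights, z)]
            bias -= lr * grad
    return False, rounds, wealth


def make_pairs(n, shift, seed=0):
    """Hypothetical toy stream: 2-D Gaussians whose first coordinate
    has mean +shift for x and -shift for y (shift=0 gives the null)."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0)]
        y = [rng.gauss(-shift, 1.0), rng.gauss(0.0, 1.0)]
        yield x, y


rejected, rounds, wealth = sequential_two_sample_test(make_pairs(2000, 1.5))
print(rejected, rounds, round(wealth, 2))
```

The wealth grows only when the predictor separates the two samples on data it has not yet seen, which is why the update step comes after the bet. A data-dependent bet fraction would typically grow wealth faster than the fixed one assumed here.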
Pages: 33