MMD Aggregated Two-Sample Test

Cited by: 0
Authors
Schrab, Antonin [1, 2]
Kim, Ilmun [3]
Albert, Melisande [4, 5, 6]
Laurent, Beatrice [4, 5, 6]
Guedj, Benjamin [1, 7]
Gretton, Arthur [8 ]
Affiliations
[1] UCL, Ctr Artificial Intelligence, London WC1V 6LJ, England
[2] UCL, Gatsby Computat Neurosci Unit, Inria London, London WC1V 6LJ, England
[3] Yonsei Univ, Dept Stat & Data Sci, Dept Appl Stat, Seoul 03722, South Korea
[4] Inst Math Toulouse, Toulouse, France
[5] Univ Toulouse, UMR 5219, Toulouse, France
[6] CNRS, INSA, Paris, France
[7] Inria London, London WC1V 6LJ, England
[8] UCL, Gatsby Computat Neurosci Unit, London W1T 4JG, England
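Funding
National Research Foundation, Singapore; UK Engineering and Physical Sciences Research Council
Keywords
two-sample testing; kernel methods; minimax adaptivity
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract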
We propose two novel nonparametric two-sample kernel tests based on the Maximum Mean Discrepancy (MMD). First, for a fixed kernel, we construct an MMD test using either permutations or a wild bootstrap, two popular numerical procedures to determine the test threshold. We prove that this test controls the probability of type I error non-asymptotically. Hence, it can be used reliably even in settings with small sample sizes as it remains well-calibrated, unlike previous MMD tests, which guarantee the correct test level only asymptotically. When the difference in densities lies in a Sobolev ball, we prove minimax optimality of our MMD test with a specific kernel depending on the smoothness parameter of the Sobolev ball. In practice, this parameter is unknown and, hence, the optimal MMD test with this particular kernel cannot be used. To overcome this issue, we construct an aggregated test, called MMDAgg, which is adaptive to the smoothness parameter. The test power is maximised over the collection of kernels used, without requiring held-out data for kernel selection (which results in a loss of test power), or arbitrary kernel choices such as the median heuristic. We prove that MMDAgg still controls the level non-asymptotically, and achieves the minimax rate over Sobolev balls, up to an iterated logarithmic term. Our guarantees are not restricted to a specific type of kernel, but hold for any product of one-dimensional translation invariant characteristic kernels. We provide a user-friendly parameter-free implementation of MMDAgg using an adaptive collection of bandwidths. We demonstrate that MMDAgg significantly outperforms alternative state-of-the-art MMD-based two-sample tests on synthetic data satisfying the Sobolev smoothness assumption, and that, on real-world image data, MMDAgg closely matches the power of tests leveraging the use of models such as neural networks.
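Illustrative sketch (not part of the record, and not the authors' implementation): the fixed-kernel permutation test described in the abstract can be outlined in Python with NumPy. The Gaussian kernel, the biased (V-statistic) MMD estimate, the default bandwidth, and all function names are assumptions made for this example only; the aggregation over kernels that defines MMDAgg is not shown.

```python
import numpy as np


def gaussian_kernel(Z, bandwidth):
    """Gaussian kernel matrix for the rows of Z (assumed kernel choice)."""
    sq_norms = np.sum(Z ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * Z @ Z.T
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def mmd2_biased(K, idx_x, idx_y):
    """Biased (V-statistic) estimate of the squared MMD from a kernel matrix."""
    k_xx = K[np.ix_(idx_x, idx_x)].mean()
    k_yy = K[np.ix_(idx_y, idx_y)].mean()
    k_xy = K[np.ix_(idx_x, idx_y)].mean()
    return k_xx + k_yy - 2.0 * k_xy


def mmd_permutation_test(X, Y, bandwidth=1.0, alpha=0.05, n_perm=200, seed=0):
    """Two-sample test with a permutation threshold (hypothetical helper).

    Returns (reject, p_value). The observed statistic is counted among the
    permuted ones, so the p-value is never smaller than 1 / (n_perm + 1).
    """
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    m, total = len(X), len(X) + len(Y)
    K = gaussian_kernel(Z, bandwidth)  # computed once, reused per permutation

    identity = np.arange(total)
    observed = mmd2_biased(K, identity[:m], identity[m:])

    stats = [observed]
    for _ in range(n_perm):
        perm = rng.permutation(total)  # relabel the pooled sample
        stats.append(mmd2_biased(K, perm[:m], perm[m:]))

    p_value = float(np.mean(np.array(stats) >= observed))
    return bool(p_value <= alpha), p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(0.0, 1.0, size=(50, 2))
    Y = rng.normal(2.0, 1.0, size=(50, 2))  # clearly shifted mean
    print(mmd_permutation_test(X, Y))
```

Permuting the pooled sample leaves its distribution unchanged under the null hypothesis, which is why a threshold built this way can hold the level at any sample size. The single hand-picked bandwidth used here is exactly the kind of arbitrary choice the abstract says the aggregated test avoids.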
Pages: 81