GRASP: a goodness-of-fit test for classification learning

Cited by: 0

Authors:

Javanmard, Adel [1, 2]

Mehrabi, Mohammad [1]

Affiliations:

[1] Univ Southern Calif, Data Sci & Operat Dept, Los Angeles, CA USA

[2] Univ Southern Calif, Data Sci & Operat Dept, 300 Bridge Hall,3670 Trousdale Pkwy, Los Angeles, CA 90089 USA

Source:

JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY | 2024 / Vol. 86 / Issue 01

Keywords:

classification; goodness-of-fit; hypothesis testing; model-X; FALSE DISCOVERY RATE; LOGISTIC-REGRESSION; MODELS; INFERENCE; ERROR;

DOI:

10.1093/jrsssb/qkad106

Chinese Library Classification (CLC):

O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];

Discipline classification codes:

020208 ; 070103 ; 0714 ;

Abstract:

The performance of classifiers is often measured in terms of average accuracy on test data. Despite being a standard measure, average accuracy fails to characterise the fit of the model to the underlying conditional law of the labels given the feature vector ($Y \mid X$), e.g. due to model misspecification, overfitting, and high dimensionality. In this paper, we consider the fundamental problem of assessing the goodness-of-fit of a general binary classifier. Our framework makes no parametric assumption on the conditional law $Y \mid X$ and treats it as a black-box oracle model that can be accessed only through queries. We formulate the goodness-of-fit assessment problem as a tolerance hypothesis test of the form $H_0 : \mathbb{E}[D_f(\mathrm{Bern}(\eta(X)) \,\|\, \mathrm{Bern}(\hat{\eta}(X)))] \le \tau$, where $D_f$ denotes an f-divergence and $\eta(x)$ and $\hat{\eta}(x)$ denote, respectively, the true and the estimated likelihood that a feature vector $x$ admits a positive label. We propose a novel test, called Goodness-of-fit with Randomisation and Scoring Procedure (GRASP), for testing $H_0$, which is valid in finite samples regardless of the feature distribution (distribution-free). We also propose model-X GRASP, designed for model-X settings where the joint distribution of the feature vector is known. Model-X GRASP uses this distributional information to achieve better power. We evaluate the performance of our tests through extensive numerical experiments.
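To make the quantity inside the null hypothesis concrete, the following minimal Python sketch estimates the average divergence between Bern(true probability) and Bern(estimated probability) over a sample of features. It is an illustration only and not the GRASP test itself: it assumes the true positive-label probability is known, whereas the abstract treats it as a black box accessed through queries. The choice of the KL divergence as the f-divergence, the one-dimensional Gaussian features, the logistic forms of `eta` and `eta_hat`, and the tolerance `tau` are all hypothetical.

```python
import math
import random


def bernoulli_kl(p: float, q: float) -> float:
    """KL divergence D(Bern(p) || Bern(q)), the f-divergence with f(t) = t*log(t)."""
    def term(a: float, b: float) -> float:
        # Convention: 0 * log(0 / b) = 0; a positive mass against zero mass is infinite.
        if a == 0.0:
            return 0.0
        if b == 0.0:
            return math.inf
        return a * math.log(a / b)

    return term(p, q) + term(1.0 - p, 1.0 - q)


def mean_divergence(eta, eta_hat, xs) -> float:
    """Sample average of D(Bern(eta(x)) || Bern(eta_hat(x))) over the features xs."""
    return sum(bernoulli_kl(eta(x), eta_hat(x)) for x in xs) / len(xs)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy setup (not from the paper): scalar Gaussian features,
# a logistic "true" model and a mis-scaled logistic estimate.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(10_000)]


def eta(x: float) -> float:
    return sigmoid(2.0 * x)      # assumed true P(Y = 1 | X = x)


def eta_hat(x: float) -> float:
    return sigmoid(1.5 * x)      # assumed classifier estimate


tau = 0.05                       # assumed tolerance level
estimate = mean_divergence(eta, eta_hat, xs)
print(f"estimated mean divergence: {estimate:.4f}; within tolerance: {estimate <= tau}")
```

In practice the true conditional probability is not available, which is why a dedicated test is needed; the sketch only shows what the tolerance hypothesis bounds.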


Pages: 215-245

Number of pages: 31
