GRASP: a goodness-of-fit test for classification learning

Cited by: 0
Authors
Javanmard, Adel [1, 2]
Mehrabi, Mohammad [1]
Affiliations
[1] Univ Southern Calif, Data Sci & Operat Dept, Los Angeles, CA USA
[2] Univ Southern Calif, Data Sci & Operat Dept, 300 Bridge Hall,3670 Trousdale Pkwy, Los Angeles, CA 90089 USA
Keywords
classification; goodness-of-fit; hypothesis testing; model-X; false discovery rate; logistic regression; models; inference; error
DOI
10.1093/jrsssb/qkad106
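Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract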
The performance of classifiers is often measured in terms of average accuracy on test data. Despite being a standard measure, average accuracy fails to characterise the fit of the model to the underlying conditional law of labels given the feature vector (Y | X), e.g. due to model misspecification, overfitting, and high dimensionality. In this paper, we consider the fundamental problem of assessing the goodness-of-fit of a general binary classifier. Our framework does not make any parametric assumption on the conditional law Y | X and treats it as a black-box oracle model which can be accessed only through queries. We formulate the goodness-of-fit assessment problem as a tolerance hypothesis test of the form $H_0 : \mathbb{E}[D_f(\mathrm{Bern}(\eta(X)) \,\|\, \mathrm{Bern}(\hat{\eta}(X)))] \le \tau$, where $D_f$ represents an f-divergence function, and $\eta(x)$ and $\hat{\eta}(x)$, respectively, denote the true and an estimated likelihood that a feature vector $x$ admits a positive label. We propose a novel test, called Goodness-of-fit with Randomisation and Scoring Procedure (GRASP), for testing $H_0$, which works in finite-sample settings regardless of the distribution of the features (distribution-free). We also propose model-X GRASP, designed for model-X settings where the joint distribution of the feature vector is known. Model-X GRASP uses this distributional information to achieve better power. We evaluate the performance of our tests through extensive numerical experiments.
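To make the quantity inside the null hypothesis concrete, the following minimal sketch computes an average f-divergence between the true and estimated Bernoulli label laws over a set of feature vectors. It assumes the KL divergence as one particular choice of f-divergence, and it assumes the true likelihood is available as a function, which is only possible in simulation (the abstract treats it as a black box). The names `bernoulli_kl`, `mean_divergence`, `eta`, and `eta_hat` are hypothetical and chosen for illustration; this is not the GRASP test itself.

```python
import math


def _term(a, b):
    # One summand a * log(a / b), using the convention 0 * log(0 / b) = 0.
    return 0.0 if a == 0.0 else a * math.log(a / b)


def bernoulli_kl(p, q):
    """KL divergence D(Bern(p) || Bern(q)); requires 0 < q < 1.

    KL is the f-divergence with f(t) = t * log(t). It is used here only
    as an illustrative choice of f-divergence.
    """
    return _term(p, q) + _term(1.0 - p, 1.0 - q)


def mean_divergence(eta, eta_hat, xs):
    """Empirical average of D(Bern(eta(x)) || Bern(eta_hat(x))) over xs.

    eta     : assumed-known true likelihood P(Y = 1 | X = x) (simulation only)
    eta_hat : the classifier's estimated likelihood
    xs      : iterable of feature vectors
    """
    xs = list(xs)
    return sum(bernoulli_kl(eta(x), eta_hat(x)) for x in xs) / len(xs)


if __name__ == "__main__":
    # Toy example: logistic truth versus a slightly shrunk estimate.
    eta = lambda x: 1.0 / (1.0 + math.exp(-x))
    eta_hat = lambda x: 1.0 / (1.0 + math.exp(-0.8 * x))
    xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
    print(mean_divergence(eta, eta_hat, xs))
```

A tolerance hypothesis of the form in the abstract would then ask whether the population version of this average stays below a chosen threshold; the test in the paper addresses this without access to `eta`.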
Pages: 215-245
Number of pages: 31