Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization

被引:0
|
作者
Miller, John [1 ]
Taori, Rohan [2 ]
Raghunathan, Aditi [2 ]
Sagawa, Shiori [2 ]
Koh, Pang Wei [2 ]
Shankar, Vaishaal [1 ]
Liang, Percy [2 ]
Carmon, Yair [3 ]
Schmidt, Ludwig [4 ]
机构
[1] Univ Calif Berkeley, Dept Comp Sci, Berkeley, CA 94720 USA
[2] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[3] Tel Aviv Univ, Sch Comp Sci, Tel Aviv, Israel
[4] Toyota Res Inst, Cambridge, MA USA
基金
美国国家科学基金会;
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
For machine learning systems to be reliable, we must understand their performance in unseen, out-of-distribution environments. In this paper, we empirically show that out-of-distribution performance is strongly correlated with in-distribution performance for a wide range of models and distribution shifts. Specifically, we demonstrate strong correlations between in-distribution and out-of-distribution performance on variants of CIFAR-10 & ImageNet, a synthetic pose estimation task derived from YCB objects, FMoW-WILDS satellite imagery classification, and wildlife classification in iWildCam-WILDS. The correlation holds across model architectures, hyperparameters, training set size, and training duration, and is more precise than what is expected from existing domain adaptation theory. To complete the picture, we also investigate cases where the correlation is weaker, for instance some synthetic distribution shifts from CIFAR-10-C and the tissue classification dataset Camelyon17-WILDS. Finally, we provide a candidate theory based on a Gaussian data model that shows how changes in the data covariance arising from distribution shift can affect the observed correlations.
引用
收藏
页数:15
相关论文
共 50 条
  • [41] Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization
    Rame, Alexandre
    Dancette, Corentin
    Cord, Matthieu
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [42] On the Impact of Spurious Correlation for Out-of-Distribution Detection
    Ming, Yifei
    Yin, Hang
    Li, Yixuan
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 10051 - 10059
  • [43] Out-of-distribution Detection Learning with Unreliable Out-of-distribution Sources
    Zheng, Haotian
    Wang, Qizhou
    Fang, Zhen
    Xia, Xiaobo
    Liu, Feng
    Liu, Tongliang
    Han, Bo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [44] Understanding the Generalization of Pretrained Diffusion Models on Out-of-Distribution Data
    Ramachandran, Sai Niranjan
    Mukhopadhyay, Rudrabha
    Agarwal, Madhav
    Jawahar, C. V.
    Namboodiri, Vinay
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 13, 2024, : 14767 - 14775
  • [45] Out-of-Distribution Generalization by Neural-Symbolic Joint Training
    Liu, Anji
    Xu, Hongming
    Van den Broeck, Guy
    Liang, Yitao
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 10, 2023, : 12252 - 12259
  • [46] An Out-of-Distribution Generalization Framework Based on Variational Backdoor Adjustment
    Su, Hang
    Wang, Wei
    MATHEMATICS, 2024, 12 (01)
  • [47] Targeted Data-driven Regularization for Out-of-Distribution Generalization
    Kamani, Mohammad Mahdi
    Farhang, Sadegh
    Mahdavi, Mehrdad
    Wang, James Z.
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 882 - 891
  • [48] The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization
    Hendrycks, Dan
    Basart, Steven
    Mu, Norman
    Kadavath, Saurav
    Wang, Frank
    Dorundo, Evan
    Desai, Rahul
    Zhu, Tyler
    Parajuli, Samyak
    Guo, Mike
    Song, Dawn
    Steinhardt, Jacob
    Gilmer, Justin
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 8320 - 8329
  • [49] Improving Out-of-Distribution Generalization by Adversarial Training with Structured Priors
    Wang, Qixun
    Wang, Yifei
    Zhu, Hong
    Wang, Yisen
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [50] Individual and Structural Graph Information Bottlenecks for Out-of-Distribution Generalization
    Yang, Ling
    Zheng, Jiayi
    Wang, Heyuan
    Liu, Zhongyi
    Huang, Zhilin
    Hong, Shenda
    Zhang, Wentao
    Cui, Bin
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (02) : 682 - 693