Bibimbap: Pre-trained models ensemble for Domain Generalization

Cited by: 6
Authors
Kang, Jinho [1]
Kim, Taero [2]
Kim, Yewon [1]
Oh, Changdae [1]
Jung, Jiyoung [1]
Chang, Rakwoo [3]
Song, Kyungwoo [1, 2, 4]
Affiliations
[1] Univ Seoul, Dept Artificial Intelligence, Seoul, South Korea
[2] Yonsei Univ, Dept Stat & Data Sci, Seoul, South Korea
[3] Univ Seoul, Dept Chem, Seoul, South Korea
[4] Univ Seoul, Seoul, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Transfer learning; Molecular classification; Domain generalization; Weight averaging; Ensemble learning; Chemical dataset;
DOI
10.1016/j.patcog.2024.110391
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This paper addresses a common machine learning problem: models are often challenged by differences between the distributions of training data and real-world data. We propose a framework that addresses underfitting in ensembling methods that use pre-trained models, and that improves the performance and robustness of deep learning models through ensemble diversity. For the naive weight ensembling framework, we found that the ensembled models may not lie in the same loss basin under extreme domain shift, suggesting that a loss barrier may exist. We add a fine-tuning step after the weighted ensemble to address the underfitting caused by the loss barrier and to stabilize the batch normalization running parameters. Qualitative analysis also suggests that the diversity of the ensembled models affects domain generalization. We validate our method on a large-scale image dataset (ImageNet-1K) and on chemical molecule data, which is well suited to testing under domain shift because of its data-splitting method.
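The ideas named in the abstract (weight averaging, a loss barrier between models, and fine-tuning after the average) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the one-parameter double-well loss, the averaging coefficients, and the barrier definition (largest excess of the loss on the straight path over the interpolated endpoint losses) are all assumptions chosen for illustration, and batch normalization statistics are not modelled.

```python
from typing import Callable, Dict, List

Params = Dict[str, List[float]]  # parameter name -> flat list of values


def average_weights(models: List[Params], coeffs: List[float]) -> Params:
    """Weighted average of models that share parameter names and shapes."""
    total = sum(coeffs)
    return {
        name: [
            sum(c * m[name][i] for c, m in zip(coeffs, models)) / total
            for i in range(len(values))
        ]
        for name, values in models[0].items()
    }


def loss_barrier(a: Params, b: Params,
                 loss: Callable[[Params], float], steps: int = 11) -> float:
    """Largest excess of the loss on the straight line a -> b over the
    linear interpolation of the two endpoint losses (assumed definition)."""
    loss_a, loss_b = loss(a), loss(b)
    worst = 0.0
    for i in range(steps):
        t = i / (steps - 1)
        point = average_weights([a, b], [1.0 - t, t])
        worst = max(worst, loss(point) - ((1.0 - t) * loss_a + t * loss_b))
    return worst


def fine_tune(params: Params, loss: Callable[[Params], float],
              lr: float = 0.1, steps: int = 50, eps: float = 1e-5) -> Params:
    """Plain gradient descent with central finite-difference gradients."""
    current = {name: list(values) for name, values in params.items()}
    for _ in range(steps):
        grads: Dict[str, List[float]] = {}
        for name, values in current.items():
            grads[name] = []
            for i in range(len(values)):
                original = values[i]
                values[i] = original + eps
                up = loss(current)
                values[i] = original - eps
                down = loss(current)
                values[i] = original
                grads[name].append((up - down) / (2 * eps))
        for name, values in current.items():
            for i in range(len(values)):
                values[i] -= lr * grads[name][i]
    return current


def double_well(p: Params) -> float:
    """Toy loss with two basins, at w = -1 and w = +1 (hypothetical)."""
    w = p["w"][0]
    return (w * w - 1.0) ** 2


# Two hypothetical fine-tuned models sitting in different basins.
model_a: Params = {"w": [-1.0]}
model_b: Params = {"w": [1.0]}

merged = average_weights([model_a, model_b], [0.25, 0.75])  # lands between basins
tuned = fine_tune(merged, double_well)                      # slides into a basin
```

In this toy setting the averaged model has a higher loss than either endpoint because the two models sit in different basins, and a few gradient steps after averaging bring it back into one of them. With real networks the same pattern would use the framework's own state dictionaries and optimizer rather than finite differences.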
Pages: 10