Rethinking Domain Generalization: Discriminability and Generalizability

Cited by: 2
Authors
Long, Shaocong [1]
Zhou, Qianyu [1]
Ying, Chenhao [1, 2]
Ma, Lizhuang [1]
Luo, Yuan [1, 2]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, Blockchain Adv Res Ctr, Wuxi 214101, Jiangsu, Peoples R China
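Keywords
Domain generalization; representation learning; discriminability; generalizability; transfer learning
DOI
10.1109/TCSVT.2024.3422887
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification code
0808; 0809
Abstract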
Domain generalization (DG) endeavours to develop robust models that possess strong generalizability while preserving excellent discriminability. Nonetheless, pivotal DG techniques tend to improve the feature generalizability by learning domain-invariant representations, inadvertently overlooking the feature discriminability. On the one hand, the simultaneous attainment of generalizability and discriminability of features presents a complex challenge, often entailing inherent contradictions. This challenge becomes particularly pronounced when domain-invariant features manifest reduced discriminability owing to the inclusion of unstable factors, i.e., spurious correlations. On the other hand, prevailing domain-invariant methods can be categorized as category-level alignment, susceptible to discarding indispensable features possessing substantial generalizability and narrowing intra-class variations. To surmount these obstacles, we rethink DG from a new perspective that concurrently imbues features with formidable discriminability and robust generalizability, and present a novel framework, namely, Discriminative Microscopic Distribution Alignment (DMDA). DMDA incorporates two core components: Selective Channel Pruning (SCP) and Micro-level Distribution Alignment (MDA). Concretely, SCP attempts to curtail redundancy within neural networks, prioritizing stable attributes conducive to accurate classification. This approach alleviates the adverse effect of spurious domain-invariance and amplifies the feature discriminability. Besides, MDA accentuates micro-level alignment within each class, going beyond mere category-level alignment. This strategy accommodates sufficient generalizable features and facilitates within-class variations. Extensive experiments on four benchmark datasets corroborate that DMDA achieves comparable results to state-of-the-art methods in DG, underscoring the efficacy of our method. 
The source code will be available at https://github.com/longshaocong/DMDA.
Pages: 11783-11797
Number of pages: 15