Rethinking Domain Generalization: Discriminability and Generalizability

Cited by: 2

Authors:

Long, Shaocong ^{[1]}

Zhou, Qianyu ^{[1]}

Ying, Chenhao ^{[1,2]}

Ma, Lizhuang ^{[1]}

Luo, Yuan ^{[1,2]}

Affiliations:

[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China

[2] Shanghai Jiao Tong Univ, Blockchain Adv Res Ctr, Wuxi 214101, Jiangsu, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024 / Vol. 34 / No. 11

Keywords:

Domain generalization; representation learning; discriminability; generalizability; transfer learning;

DOI:

10.1109/TCSVT.2024.3422887

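Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code: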

0808 ; 0809 ;

Abstract:

Domain generalization (DG) endeavours to develop robust models that possess strong generalizability while preserving excellent discriminability. Nonetheless, pivotal DG techniques tend to improve the feature generalizability by learning domain-invariant representations, inadvertently overlooking the feature discriminability. On the one hand, the simultaneous attainment of generalizability and discriminability of features presents a complex challenge, often entailing inherent contradictions. This challenge becomes particularly pronounced when domain-invariant features manifest reduced discriminability owing to the inclusion of unstable factors, i.e., spurious correlations. On the other hand, prevailing domain-invariant methods can be categorized as category-level alignment, susceptible to discarding indispensable features possessing substantial generalizability and narrowing intra-class variations. To surmount these obstacles, we rethink DG from a new perspective that concurrently imbues features with formidable discriminability and robust generalizability, and present a novel framework, namely, Discriminative Microscopic Distribution Alignment (DMDA). DMDA incorporates two core components: Selective Channel Pruning (SCP) and Micro-level Distribution Alignment (MDA). Concretely, SCP attempts to curtail redundancy within neural networks, prioritizing stable attributes conducive to accurate classification. This approach alleviates the adverse effect of spurious domain-invariance and amplifies the feature discriminability. Besides, MDA accentuates micro-level alignment within each class, going beyond mere category-level alignment. This strategy accommodates sufficient generalizable features and facilitates within-class variations. Extensive experiments on four benchmark datasets corroborate that DMDA achieves comparable results to state-of-the-art methods in DG, underscoring the efficacy of our method. 
The source code will be available at https://github.com/longshaocong/DMDA.


Pages: 11783-11797

Page count: 15

