Rethinking Domain Generalization: Discriminability and Generalizability

Cited by: 2

Authors:

Long, Shaocong [1]

Zhou, Qianyu [1]

Ying, Chenhao [1,2]

Ma, Lizhuang [1]

Luo, Yuan [1,2]

Affiliations:

[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China

[2] Shanghai Jiao Tong Univ, Blockchain Adv Res Ctr, Wuxi 214101, Jiangsu, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024 / Vol. 34 / No. 11

Keywords:

Domain generalization; representation learning; discriminability; generalizability; transfer learning

DOI:

10.1109/TCSVT.2024.3422887

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification code:

0808; 0809

Abstract:

Domain generalization (DG) endeavours to develop robust models that possess strong generalizability while preserving excellent discriminability. Nonetheless, pivotal DG techniques tend to improve the feature generalizability by learning domain-invariant representations, inadvertently overlooking the feature discriminability. On the one hand, the simultaneous attainment of generalizability and discriminability of features presents a complex challenge, often entailing inherent contradictions. This challenge becomes particularly pronounced when domain-invariant features manifest reduced discriminability owing to the inclusion of unstable factors, i.e., spurious correlations. On the other hand, prevailing domain-invariant methods can be categorized as category-level alignment, susceptible to discarding indispensable features possessing substantial generalizability and narrowing intra-class variations. To surmount these obstacles, we rethink DG from a new perspective that concurrently imbues features with formidable discriminability and robust generalizability, and present a novel framework, namely, Discriminative Microscopic Distribution Alignment (DMDA). DMDA incorporates two core components: Selective Channel Pruning (SCP) and Micro-level Distribution Alignment (MDA). Concretely, SCP attempts to curtail redundancy within neural networks, prioritizing stable attributes conducive to accurate classification. This approach alleviates the adverse effect of spurious domain-invariance and amplifies the feature discriminability. Besides, MDA accentuates micro-level alignment within each class, going beyond mere category-level alignment. This strategy accommodates sufficient generalizable features and facilitates within-class variations. Extensive experiments on four benchmark datasets corroborate that DMDA achieves comparable results to state-of-the-art methods in DG, underscoring the efficacy of our method. 
The source code will be available at https://github.com/longshaocong/DMDA.


Pages: 11783-11797

Page count: 15

References: 93 in total

