Semantic Transfer from Head to Tail: Enlarging Tail Margin for Long-Tailed Visual Recognition

Cited by: 0

Authors:

Zhang, Shan [1]

Ni, Yao [1]

Du, Jinhao [2]

Liu, Yanxia [3]

Koniusz, Piotr [1,4]

Affiliations:

[1] Australian National University, Canberra, ACT, Australia

[2] Peking University, Beijing, People's Republic of China

[3] Beijing Union University, Beijing, People's Republic of China

[4] Data61 CSIRO, Eveleigh, Australia

Source:

2024 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2024 | 2024

DOI:

10.1109/WACV57701.2024.00138

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Deep neural networks excel in visual recognition tasks, but their success hinges on access to balanced datasets. Yet, real-world datasets often exhibit a long-tailed distribution, compromising network efficiency and hampering generalization on unseen data. To enhance the model's generalization in long-tailed scenarios, we present a novel feature augmentation approach termed SeMAntic tRansfer from head to Tail (SMART), which enriches the feature patterns for tail samples by transferring semantic covariance from the head classes to the tail classes along semantically correlating dimensions. This strategy boosts the model's generalization ability by implicitly and adaptively weighting the logits, thereby widening the classification margin of tail classes. Inspired by the success of this weighting, we further incorporate a semantic-aware weighting strategy for the loss tied to tail samples. This amplifies the effect of enlarging the margin for tail classes. We are the first to provide theoretical analysis that demonstrates a large semantic diversity in tail samples can increase class margins during the training stage, leading to improved generalization. Empirical observations support our theory. Notably, with no need for extra data or learnable parameters, SMART achieves state-of-the-art results on five long-tailed benchmark datasets: CIFAR-10/100-LT, Places-LT, ImageNet-LT, and iNaturalist 2018.
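The abstract describes enriching tail-class features with semantic variation borrowed from head classes. The following is only a rough illustrative sketch of that general idea, not the authors' SMART implementation: it perturbs each tail feature with centered deviations drawn from a head class, so the added noise follows the head class's empirical covariance. The function names, the `strength` and `copies` parameters, and the random pairing of tail features with head deviations are all assumptions made for illustration. The selection of "semantically correlating dimensions" and the semantic-aware loss weighting mentioned in the abstract are not modeled here.

```python
import random


def class_mean(features):
    """Per-dimension mean of a list of equal-length feature vectors."""
    dim = len(features[0])
    return [sum(f[d] for f in features) / len(features) for d in range(dim)]


def transfer_head_variation(tail_features, head_features,
                            strength=0.5, copies=4, seed=0):
    """Hypothetical helper: augment tail features with head-class variation.

    Each augmented feature is a tail feature plus a scaled, centered
    deviation (head feature minus head mean) sampled from the head class.
    This is an assumed simplification, not the published method.
    """
    rng = random.Random(seed)
    head_mean = class_mean(head_features)
    # Centered head deviations carry the head class's empirical covariance.
    deviations = [[x - m for x, m in zip(f, head_mean)] for f in head_features]
    augmented = []
    for tail in tail_features:
        for _ in range(copies):
            dev = rng.choice(deviations)
            augmented.append([t + strength * d for t, d in zip(tail, dev)])
    return augmented
```

For example, with a head class that varies only along the first dimension, a single tail feature gains spread along that dimension and stays fixed along the other, which is the sense in which the head class's variation pattern is carried over in this sketch.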

Pages: 1339-1349

Page count: 11