Alleviating Over-Fitting in Hashing-Based Fine-Grained Image Retrieval: From Causal Feature Learning to Binary-Injected Hash Learning

Cited: 2
Authors
Xiang, Xinguang [1 ]
Ding, Xinhao [1 ]
Jin, Lu [1 ]
Li, Zechao [1 ]
Tang, Jinhui [1 ]
Jain, Ramesh [2 ]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[2] Univ Calif Irvine, Irvine, CA 92617, USA
Funding
National Natural Science Foundation of China;
Keywords
Codes; Representation learning; Feature extraction; Task analysis; Image retrieval; Noise; Binary codes; Hashing-based fine-grained image retrieval; over-fitting; causal inference;
DOI
10.1109/TMM.2024.3410136
CLC classification
TP [Automation and Computer Technology];
Discipline code
0812;
Abstract
Hashing-based fine-grained image retrieval pursues learning diverse local features to generate inter-class discriminative hash codes. However, existing fine-grained hashing methods with attention mechanisms tend to focus on only a few obvious regions, misguiding the network into over-fitting a handful of salient features. This problem raises two main limitations. 1) It overlooks subtle local features, degrading the generalization capability of the learned embedding. 2) It causes the over-activation of hash bits correlated with salient features, which breaks the balance of the binary codes and further weakens their discriminative ability. To address these limitations of the over-fitting problem, we propose a novel hashing framework from Causal Feature learning to Binary-injected Hash learning (CFBH), which captures varied local information and suppresses over-activated hash bits simultaneously. For causal feature learning, we adopt causal inference theory to alleviate the bias towards salient regions in fine-grained images. In detail, we obtain local features from the feature map and, following this theory, combine this local information with the original image information. Theoretically, these fused embeddings help the network re-weight the retrieval contribution of each local feature and exploit more subtle variations without observational bias. For binary-injected hash learning, we propose a Binary Noise Injection (BNI) module inspired by Dropout. The BNI module not only mitigates over-activation of particular bits, but also makes hash codes uncorrelated and balanced in the Hamming space. Extensive experimental results on six popular fine-grained image datasets demonstrate the superiority of CFBH over several state-of-the-art methods.
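The abstract describes the BNI module only at a high level. As a rough illustration of the Dropout-inspired idea, the sketch below randomly sign-flips pre-binarization hash activations during training so that no single bit can dominate. The function name, flip probability, and NumPy formulation are all illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def binary_noise_injection(hash_logits, flip_prob=0.1, rng=None):
    """Illustrative sketch of Dropout-style binary noise injection.

    Each pre-binarization activation is randomly sign-flipped with
    probability `flip_prob`, discouraging the network from relying on
    (over-activating) any particular hash bit.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Random mask: -1 flips a bit's sign, +1 leaves it unchanged.
    flip_mask = np.where(rng.random(hash_logits.shape) < flip_prob, -1.0, 1.0)
    return hash_logits * flip_mask

# Toy usage: 2 images, 8-bit codes, binarized to {-1, +1} after injection.
logits = np.random.default_rng(0).standard_normal((2, 8))
noisy = binary_noise_injection(logits, flip_prob=0.25,
                               rng=np.random.default_rng(1))
codes = np.sign(noisy)
```

At inference time such noise would be disabled (as with Dropout), so retrieval uses the clean binarized codes.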
Pages: 10665-10677
Page count: 13