Webly Supervised Fine-Grained Image Recognition with Graph Representation and Metric Learning

Cited by: 1
Authors
Lin, Jianman [1]
Lin, Jiantao [2]
Gao, Yuefang [3]
Yang, Zhijing [4]
Chen, Tianshui [4]
Affiliations
[1] Guangdong Univ Technol, Sch Mech & Elect Engn, Guangzhou 510006, Peoples R China
[2] Jinan Univ, Sch Intelligent Sci & Engn, Zhuhai 519077, Peoples R China
[3] South China Agr Univ, Coll Math & Informat, Guangzhou 510642, Peoples R China
[4] Guangdong Univ Technol, Sch Informat Engn, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
webly supervised learning; fine-grained image recognition; graph representation learning; graph metric learning; noisy data;
DOI
10.3390/electronics11244127
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Webly supervised fine-grained image recognition (FGIR) aims to distinguish subordinate categories using data retrieved from the Internet, which can significantly reduce deep learning's dependence on manually annotated labels. Most current FGIR algorithms follow a large-scale, data-driven deep learning paradigm that relies heavily on manual annotation, yet a large amount of weakly labeled data is freely available on the Internet. To exploit such fine-grained web data effectively, this paper proposes a Graph Representation and Metric Learning (GRML) framework that learns discriminative holistic-local features through graph representation for web fine-grained images while simultaneously handling noisy labels, thereby making effective use of webly supervised data for training. Specifically, we first design an attention-focused module to locate the most discriminative regions across different spatial positions and sizes. Next, a structured instance graph is constructed to correlate holistic and local features and model their interaction, and a graph prototype containing both holistic and local information is introduced for each category to learn a category-level graph representation that assists in handling noisy labels. Finally, a graph matching module explores the holistic-local information interaction through intra-graph node information propagation and evaluates the similarity between each instance graph and its corresponding category-level graph prototype through inter-graph node information propagation. Extensive experiments on three webly supervised FGIR benchmark datasets, Web-Bird, Web-Aircraft and Web-Car, yield classification accuracies of 76.62%, 85.79% and 82.99%, respectively, improvements of 2.47%, 4.72% and 1.59% over Peer-learning.
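The two graph operations named in the abstract — intra-graph node information propagation over an instance graph of holistic and local features, then an inter-graph similarity score against a category-level graph prototype — can be illustrated with a toy sketch. Everything below (the star topology, the averaging update rule, the feature size, and node-wise cosine matching) is an illustrative assumption, not the paper's actual formulation.

```python
import numpy as np

def propagate(nodes, adj, steps=2):
    """Intra-graph message passing: each node mixes its own feature with
    the mean of its neighbours' features (a generic GNN-style update;
    the paper's exact update rule is not reproduced here)."""
    deg = adj.sum(axis=1, keepdims=True) + 1e-8  # avoid divide-by-zero
    for _ in range(steps):
        nodes = 0.5 * nodes + 0.5 * (adj @ nodes) / deg
    return nodes

def graph_similarity(instance_nodes, proto_nodes):
    """Inter-graph score: mean cosine similarity between corresponding
    holistic/local nodes of an instance graph and a category-level
    graph prototype (an illustrative matching scheme)."""
    a = instance_nodes / np.linalg.norm(instance_nodes, axis=1, keepdims=True)
    b = proto_nodes / np.linalg.norm(proto_nodes, axis=1, keepdims=True)
    return float((a * b).sum(axis=1).mean())

# Toy instance graph: 1 holistic node + 3 local-region nodes,
# connected in a star around the holistic node.
rng = np.random.default_rng(0)
nodes = rng.normal(size=(4, 8))          # 4 nodes, 8-dim features
adj = np.zeros((4, 4))
adj[0, 1:] = adj[1:, 0] = 1.0            # star topology
refined = propagate(nodes, adj)
score = graph_similarity(refined, rng.normal(size=(4, 8)))
```

In the paper, such a similarity score between an instance graph and its category prototype is what lets the framework down-weight or reject noisily labeled web samples; the sketch only shows the mechanical shape of that computation.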
Pages: 12
Related Papers
20 records in total
[1]  
Wah, Catherine; Branson, Steve, 2011, Tech. Rep. CNS-TR-2011-001
[2]   Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning [J].
Chen, Tianshui ;
Pu, Tao ;
Wu, Hefeng ;
Xie, Yuan ;
Liu, Lingbo ;
Lin, Liang .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (12) :9887-9903
[3]   Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition [J].
Chen, Tianshui ;
Lin, Liang ;
Chen, Riquan ;
Hui, Xiaolu ;
Wu, Hefeng .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (03) :1371-1384
[4]   Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels [J].
Han, Bo ;
Yao, Quanming ;
Yu, Xingrui ;
Niu, Gang ;
Xu, Miao ;
Hu, Weihua ;
Tsang, Ivor W. ;
Sugiyama, Masashi .
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
[5]   Deep Residual Learning for Image Recognition [J].
He, Kaiming ;
Zhang, Xiangyu ;
Ren, Shaoqing ;
Sun, Jian .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :770-778
[6]   3D Object Representations for Fine-Grained Categorization [J].
Krause, Jonathan ;
Stark, Michael ;
Deng, Jia ;
Li Fei-Fei .
2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2013, :554-561
[7]  
Li J., 2020, arXiv
[8]  
Liu JX, 2012, LECT NOTES COMPUT SC, V7572, P172, DOI 10.1007/978-3-642-33718-5_13
[9]  
Liu Xiao, 2016, arXiv
[10]  
Maji Subhransu, 2013, arXiv