Webly Supervised Fine-Grained Image Recognition with Graph Representation and Metric Learning

Cited by: 1
Authors
Lin, Jianman [1]
Lin, Jiantao [2]
Gao, Yuefang [3]
Yang, Zhijing [4]
Chen, Tianshui [4]
Affiliations
[1] Guangdong Univ Technol, Sch Mech & Elect Engn, Guangzhou 510006, Peoples R China
[2] Jinan Univ, Sch Intelligent Sci & Engn, Zhuhai 519077, Peoples R China
[3] South China Agr Univ, Coll Math & Informat, Guangzhou 510642, Peoples R China
[4] Guangdong Univ Technol, Sch Informat Engn, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
webly supervised learning; fine-grained image recognition; graph representation learning; graph metric learning; noisy data;
DOI
10.3390/electronics11244127
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Webly supervised fine-grained image recognition (FGIR) aims to distinguish subordinate categories using data retrieved from the Internet, which can significantly reduce deep learning's dependence on manually annotated labels. Most current FGIR algorithms follow a large-scale, data-driven deep learning paradigm that relies heavily on manual annotation, yet a large amount of weakly labeled data is freely available on the Internet. To exploit such fine-grained web data effectively, this paper proposes a Graph Representation and Metric Learning (GRML) framework that learns discriminative holistic-local features through graph representation for web fine-grained images while simultaneously handling noisy labels, thereby making effective use of webly supervised data for training. Specifically, we first design an attention-focused module to locate the most discriminative regions across different spatial positions and sizes. Next, a structured instance graph is constructed to correlate holistic and local features and model their interaction, and a graph prototype containing both holistic and local information is introduced for each category to learn a category-level graph representation that assists in handling noisy labels. Finally, a graph matching module explores the holistic-local information interaction through intra-graph node information propagation and evaluates the similarity between each instance graph and its corresponding category-level graph prototype through inter-graph node information propagation. Extensive experiments on three webly supervised FGIR benchmark datasets, Web-Bird, Web-Aircraft and Web-Car, yield classification accuracies of 76.62%, 85.79% and 82.99%, respectively, improvements of 2.47%, 4.72% and 1.59% over Peer-learning.
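The two graph operations named in the abstract — intra-graph node information propagation over an instance graph of holistic and local features, then an inter-graph similarity score against a category-level graph prototype — can be illustrated with a toy sketch. Everything below (the star topology, the averaging update rule, the feature size, and node-wise cosine matching) is an illustrative assumption, not the paper's actual formulation.

```python
import numpy as np

def propagate(nodes, adj, steps=2):
    """Intra-graph message passing: each node mixes its own feature with
    the mean of its neighbours' features (a generic GNN-style update;
    the paper's exact update rule is not reproduced here)."""
    deg = adj.sum(axis=1, keepdims=True) + 1e-8  # avoid divide-by-zero
    for _ in range(steps):
        nodes = 0.5 * nodes + 0.5 * (adj @ nodes) / deg
    return nodes

def graph_similarity(instance_nodes, proto_nodes):
    """Inter-graph score: mean cosine similarity between corresponding
    holistic/local nodes of an instance graph and a category-level
    graph prototype (an illustrative matching scheme)."""
    a = instance_nodes / np.linalg.norm(instance_nodes, axis=1, keepdims=True)
    b = proto_nodes / np.linalg.norm(proto_nodes, axis=1, keepdims=True)
    return float((a * b).sum(axis=1).mean())

# Toy instance graph: 1 holistic node + 3 local-region nodes,
# connected in a star around the holistic node.
rng = np.random.default_rng(0)
nodes = rng.normal(size=(4, 8))          # 4 nodes, 8-dim features
adj = np.zeros((4, 4))
adj[0, 1:] = adj[1:, 0] = 1.0            # star topology
refined = propagate(nodes, adj)
score = graph_similarity(refined, rng.normal(size=(4, 8)))
```

In the paper, such a similarity score between an instance graph and its category prototype is what lets the framework down-weight or reject noisily labeled web samples; the sketch only shows the mechanical shape of that computation.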
Pages: 12
Related Papers
20 records in total
[1]  
Wah, Catherine; Branson, Steve, 2011, Tech. Rep. CNS-TR-2011-001
[2]   Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning [J].
Chen, Tianshui ;
Pu, Tao ;
Wu, Hefeng ;
Xie, Yuan ;
Liu, Lingbo ;
Lin, Liang .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (12) :9887-9903
[3]   Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition [J].
Chen, Tianshui ;
Lin, Liang ;
Chen, Riquan ;
Hui, Xiaolu ;
Wu, Hefeng .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (03) :1371-1384
[4]   Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels [J].
Han, Bo ;
Yao, Quanming ;
Yu, Xingrui ;
Niu, Gang ;
Xu, Miao ;
Hu, Weihua ;
Tsang, Ivor W. ;
Sugiyama, Masashi .
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
[5]   Deep Residual Learning for Image Recognition [J].
He, Kaiming ;
Zhang, Xiangyu ;
Ren, Shaoqing ;
Sun, Jian .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :770-778
[6]   3D Object Representations for Fine-Grained Categorization [J].
Krause, Jonathan ;
Stark, Michael ;
Deng, Jia ;
Li Fei-Fei .
2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2013, :554-561
[7]  
Li J., 2020, arXiv
[8]  
Liu JX, 2012, LECT NOTES COMPUT SC, V7572, P172, DOI 10.1007/978-3-642-33718-5_13
[9]  
Liu Xiao, 2016, arXiv
[10]  
Maji Subhransu, 2013, arXiv