Fine-grained Categorization and Dataset Bootstrapping using Deep Metric Learning with Humans in the Loop

Cited by: 149
Authors
Cui, Yin [1, 2]
Zhou, Feng [3]
Lin, Yuanqing [3]
Belongie, Serge [1, 2]
Affiliations
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14853 USA
[2] Cornell Tech, New York, NY 10011 USA
[3] NEC Labs Amer, Princeton, NJ USA
DOI
10.1109/CVPR.2016.130
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing fine-grained visual categorization methods often suffer from three challenges: lack of training data, a large number of fine-grained categories, and high intra-class versus low inter-class variance. In this work we propose a generic iterative framework for fine-grained categorization and dataset bootstrapping that handles these three challenges. Using deep metric learning with humans in the loop, we learn a low-dimensional feature embedding with anchor points on manifolds for each category. These anchor points capture intra-class variances and remain discriminative between classes. In each round, images with high confidence scores from our model are sent to humans for labeling. By comparing with exemplar images, labelers mark each candidate image as either a "true positive" or a "false positive." True positives are added into our current dataset and false positives are regarded as "hard negatives" for our metric learning model. Then the model is retrained with an expanded dataset and hard negatives for the next round. To demonstrate the effectiveness of the proposed framework, we bootstrap a fine-grained flower dataset with 620 categories from Instagram images. The proposed deep metric learning scheme is evaluated on both our dataset and the CUB-200-2011 Birds dataset. Experimental evaluations show significant performance gain using dataset bootstrapping and demonstrate state-of-the-art results achieved by the proposed deep metric learning methods.
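The round structure described in the abstract (propose confident candidates, have a human confirm or reject them, grow the dataset, retrain) can be illustrated with a toy sketch. This is not the authors' implementation: the deep metric learner is replaced here by a per-class centroid in a 2-D space, the human labeler is simulated by a lookup table, and all names (`bootstrap_round`, `max_dist`, `oracle`, the sample points) are hypothetical.

```python
import math


def centroid(points):
    """Mean of a list of equal-length tuples; stand-in for a learned anchor point."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def bootstrap_round(labeled, hard_negatives, candidates, oracle, max_dist):
    """Run one bootstrapping round and return the candidates left in the pool.

    labeled / hard_negatives: dict mapping class name -> list of points.
    oracle(x, cls): simulated human answer, True if x really belongs to cls.
    max_dist: assumed confidence threshold (closer to an anchor = more confident).
    """
    # "Retraining" stand-in: recompute one anchor per class from current labels.
    anchors = {c: centroid(pts) for c, pts in labeled.items()}
    remaining = []
    for x in candidates:
        cls = min(anchors, key=lambda c: math.dist(x, anchors[c]))
        if math.dist(x, anchors[cls]) > max_dist:
            remaining.append(x)  # low confidence: not sent for labeling
        elif oracle(x, cls):
            labeled[cls].append(x)  # true positive: expands the dataset
        else:
            # False positive: kept as a hard negative. A real metric learner
            # would train against these; the centroid stand-in does not.
            hard_negatives[cls].append(x)
    return remaining


# Toy data: two classes and four unlabeled candidates with hidden true labels.
labeled = {"rose": [(0.0, 0.0), (0.2, 0.0)], "tulip": [(5.0, 5.0), (5.0, 5.2)]}
hard = {"rose": [], "tulip": []}
true_label = {
    (0.1, 0.3): "rose",
    (4.8, 5.1): "tulip",
    (0.4, -0.2): "tulip",  # looks like a rose to the model, but is not
    (2.5, 2.5): "rose",    # too far from any anchor to be proposed
}
remaining = bootstrap_round(
    labeled, hard, list(true_label), lambda x, c: true_label[x] == c, max_dist=1.0
)
```

In this toy run, two candidates are confirmed and added, one is rejected and stored as a hard negative for "rose", and one stays in the pool for a later round.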
Pages: 1153-1162
Page count: 10