DeepProduct: Mobile Product Search With Portable Deep Features

Cited by: 17
Authors
Jiang, Yu-Gang [1]
Li, Minjun [1]
Wang, Xi [1]
Liu, Wei [2]
Hua, Xian-Sheng [3]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, 825 Zhangheng Rd, Shanghai, Peoples R China
[2] Columbia Univ, New York, NY USA
[3] Alibaba Grp, Hangzhou, Zhejiang, Peoples R China
Keywords
Mobile product search; deep learning; efficiency; contrastive loss; retrieval
DOI
10.1145/3184745
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Features extracted by deep networks have been popular in many visual search tasks. This article studies deep network structures and training schemes for mobile visual search. The goal is to learn an effective yet portable feature representation that bridges the domain gap between mobile user photos and (mostly) professionally taken product images while keeping the computational cost acceptable for mobile-based applications. The technical contributions are twofold. First, we propose an alternative to the contrastive loss popularly used for training deep Siamese networks, namely robust contrastive loss, where we relax the penalty on some positive and negative pairs to alleviate overfitting. Second, we use a simple multitask fine-tuning scheme to train the network, which not only utilizes knowledge from the provided training photo pairs but also harnesses additional information from the large ImageNet dataset to regularize the fine-tuning process. Extensive experiments on challenging real-world datasets demonstrate that both the robust contrastive loss and the multitask fine-tuning scheme are effective, leading to promising results with a time cost suitable for mobile product search scenarios.
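The abstract does not give the formula for the robust contrastive loss, so the following is only an illustrative sketch of the general idea of relaxing the penalty on some pairs. It contrasts the standard contrastive loss for Siamese networks with a hypothetical relaxed variant in which matching pairs closer than a small margin incur no penalty. The function names and the parameters `pos_margin` and `neg_margin` are assumptions made for illustration and are not taken from the article.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(d, is_match, margin=1.0):
    """Standard contrastive loss for one pair at embedding distance d.

    Matching pairs are always pulled together; non-matching pairs are
    pushed apart only while they are closer than `margin`.
    """
    if is_match:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


def relaxed_contrastive_loss(d, is_match, pos_margin=0.2, neg_margin=1.0):
    """Hypothetical relaxed variant (an assumption, not the paper's formula).

    Matching pairs already closer than `pos_margin` incur no penalty, so
    the network is not forced to collapse them to exactly the same point.
    """
    if is_match:
        return 0.5 * max(0.0, d - pos_margin) ** 2
    return 0.5 * max(0.0, neg_margin - d) ** 2


if __name__ == "__main__":
    d = euclidean([0.0, 0.0], [0.06, 0.08])  # a close matching pair
    print(contrastive_loss(d, True), relaxed_contrastive_loss(d, True))
```

In this sketch the two losses agree on non-matching pairs and differ only for matching pairs that are already close, which is one simple way a penalty can be relaxed to reduce overfitting to noisy pairs.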
Pages: 18