Condition-CNN: A hierarchical multi-label fashion image classification model

Cited by: 40

Authors:

Kolisnik, Brendan [1]

Hogan, Isaac [1]

Zulkernine, Farhana [1]

Affiliations:

[1] Queens Univ, Sch Comp, Kingston, ON K7L 2N8, Canada

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2021 / Volume 182

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

Condition-CNN; Branching convolutional neural networks; Image classification; Convolutional neural networks; Hierarchical image classification; Teacher Forcing;

DOI:

10.1016/j.eswa.2021.115195

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Current state-of-the-art image classifiers predict a single class label for an image. However, in many industry settings such as online shopping, images belong to a class hierarchy in which the first level represents the coarse-grained, most abstract class and subsequent levels represent more specific classes. We propose a novel hierarchical image classification model, Condition-CNN, which addresses some of the shortcomings of the branching convolutional neural network (B-CNN) in terms of training time and fine-grained accuracy. It applies the Teacher Forcing training algorithm, in which the actual class labels of the higher-level classes, rather than the predicted labels, are used to train the lower-level branches. The technique also prevents error propagation and thereby reduces the training time. Besides learning the image features for each level of classes, Condition-CNN also learns the relationship between different levels of classes as conditional probabilities, which are used to estimate class predictions during scoring. By feeding the estimated higher-level class predictions as priors to the lower-level class prediction, Condition-CNN achieves superior prediction accuracy while requiring fewer trainable parameters than the baseline CNN models. The validation results of Condition-CNN on the Kaggle Fashion Product Images data set demonstrate prediction accuracies of 99.8%, 98.1%, and 91.0% for Level 1, 2, and 3 classes, respectively, which are greater than those of B-CNN and other baseline CNN models. Moreover, Condition-CNN used only 77.58% of the trainable parameters of B-CNN.
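
The conditioning idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `COND`, `child_prior`, and `condition_child` are hypothetical, the toy hierarchy and logits are invented for illustration, and combining the prior with the lower-level logits by adding its logarithm is an assumption about one plausible way to use a prior. During training (Teacher Forcing) the true higher-level label is fed in as a one-hot vector; during scoring the predicted higher-level distribution is used instead.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def child_prior(parent_probs, cond):
    """prior[j] = sum_i parent_probs[i] * cond[i][j], with cond[i][j] ~ P(child=j | parent=i)."""
    n_child = len(cond[0])
    return [sum(p * row[j] for p, row in zip(parent_probs, cond))
            for j in range(n_child)]

def condition_child(child_logits, parent_probs, cond, eps=1e-9):
    """Combine image-based child logits with a prior derived from the parent level.

    Assumption: the prior is applied multiplicatively (added in log space).
    """
    prior = child_prior(parent_probs, cond)
    adjusted = [z + math.log(q + eps) for z, q in zip(child_logits, prior)]
    return softmax(adjusted)

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

# Hypothetical toy hierarchy: 2 coarse classes, 3 fine classes.
COND = [
    [0.5, 0.5, 0.0],  # coarse class 0 contains fine classes 0 and 1
    [0.0, 0.0, 1.0],  # coarse class 1 contains fine class 2
]
child_logits = [0.2, 0.1, 0.3]  # made-up image-based scores for the fine level

# Training (Teacher Forcing): condition on the true coarse label (class 0).
train_probs = condition_child(child_logits, one_hot(0, 2), COND)

# Scoring: condition on the predicted coarse distribution.
score_probs = condition_child(child_logits, [0.9, 0.1], COND)
```

In this toy case the unconditioned logits slightly favour fine class 2, but conditioning on a coarse prediction that favours class 0 shifts the fine-level decision to a child of class 0, which is the kind of hierarchy-consistent behaviour the abstract attributes to using higher-level predictions as priors.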


Pages: 14
