Deep Label Distribution Learning With Label Ambiguity

Cited by: 388

Authors:

Gao, Bin-Bin [1]

Xing, Chao [2]

Xie, Chen-Wei [1]

Wu, Jianxin [1]

Geng, Xin [2]

Affiliations:

[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China

[2] Southeast Univ, Sch Comp Sci & Engn, MOE Key Lab Comp Network & Informat Integrat, Nanjing 211189, Jiangsu, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2017 / Vol. 26 / No. 06

Funding:

National Natural Science Foundation of China;

Keywords:

Label distribution; deep learning; age estimation; head pose estimation; semantic segmentation; HEAD POSE ESTIMATION; AGE; IMAGE;

DOI:

10.1109/TIP.2017.2689998

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Convolutional neural networks (ConvNets) have achieved excellent recognition performance in various visual recognition tasks. A large labeled training set is one of the most important factors for their success. However, it is difficult to collect sufficient training images with precise labels in some domains, such as apparent age estimation, head pose estimation, multi-label classification, and semantic segmentation. Fortunately, there is ambiguous information among labels, which makes these tasks different from traditional classification. Based on this observation, we convert the label of each image into a discrete label distribution, and learn the label distribution by minimizing a Kullback-Leibler divergence between the predicted and ground-truth label distributions using deep ConvNets. The proposed deep label distribution learning (DLDL) method effectively utilizes the label ambiguity in both feature learning and classifier learning, which helps prevent the network from overfitting even when the training set is small. Experimental results show that the proposed approach produces significantly better results than the state-of-the-art methods for age estimation and head pose estimation. At the same time, it also improves recognition performance for multi-label classification and semantic segmentation tasks.
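The two steps named in the abstract (converting a single label into a discrete label distribution, then scoring a prediction against it with a Kullback-Leibler divergence) can be illustrated with a minimal sketch. This is not the authors' implementation. The Gaussian shape of the target distribution, the `sigma` value, the 0 to 100 age range, and all function names are assumptions made for illustration only; the abstract does not specify them.

```python
import math


def gaussian_label_distribution(true_label, labels, sigma=2.0):
    """Spread a single scalar label over all candidate labels.

    ASSUMPTION: a discretized Gaussian centered on the true label;
    the abstract does not say which distribution shape is used.
    """
    weights = [math.exp(-((l - true_label) ** 2) / (2.0 * sigma ** 2))
               for l in labels]
    total = sum(weights)
    return [w / total for w in weights]  # normalize so the entries sum to 1


def softmax(logits):
    """Turn raw network outputs into a predicted label distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted); terms with zero target mass contribute 0."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, predicted) if t > 0.0)


# Hypothetical age-estimation setup: candidate ages 0..100, true age 25.
labels = list(range(0, 101))
target = gaussian_label_distribution(25, labels, sigma=2.0)

# Stand-in for a ConvNet output: all-zero logits give a uniform prediction.
predicted = softmax([0.0] * len(labels))

loss = kl_divergence(target, predicted)
print(f"KL loss against a uniform prediction: {loss:.4f}")
```

In a training loop, `loss` would be the quantity minimized with respect to the network weights. Neighboring labels (ages 24 and 26 here) receive non-zero target mass, which is one way to read the abstract's point about exploiting label ambiguity.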

Citation

Pages: 2825-2838

Page count: 14

