Multi-task Deep Learning for Image Understanding

Cited by: 0

Authors:

Yu, Bo [1,2,3]

Lane, Ian [3]

Affiliations:

[1] Chinese Acad Sci, Inst Remote Sensing & Digital Earth, State Key Lab Remote Sensing Sci, Beijing 100101, Peoples R China

[2] Chinese Acad Sci, Grad Univ, Beijing 100049, Peoples R China

[3] Carnegie Mellon Univ, Moffett Field, CA 94043 USA

Source:

2014 6TH INTERNATIONAL CONFERENCE OF SOFT COMPUTING AND PATTERN RECOGNITION (SOCPAR) | 2014

Keywords:

image segmentation; deep learning; multi-task learning; face detection

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory and methods]

Discipline classification code:

081202

Abstract:

Deep learning models can obtain state-of-the-art performance across many speech and image processing tasks, often significantly outperforming earlier methods. In this paper, we attempt to further improve the performance of these models by introducing multi-task training, in which a combined deep learning model is trained for two inter-related tasks. We show that by introducing a secondary task (such as shape identification in the object classification task) we are able to significantly improve the performance of the main task for which the model is trained. Using public datasets, we evaluated our approach on two image understanding tasks: image segmentation and object classification. On the image segmentation task, we observed that the multi-task model almost doubled the accuracy of segmentation at the pixel level (from 18.7% to 35.6%) compared to the single-task model, and improved the performance of face detection by 10.2% (from 70.1% to 80.3%). For the object classification task, we observed a 2.1% improvement in classification accuracy (from 91.6% to 93.7%) compared to a single-task model. The proposed multi-task models obtained significantly higher accuracies than previously published results on these datasets, obtaining 22.0% and 6.2% higher accuracies on the face detection and object classification tasks, respectively. These results demonstrate the effectiveness of multi-task training of deep learning models for image understanding tasks.
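The abstract describes a combined model trained on a main task together with a secondary task. A minimal sketch of that general idea follows, assuming a single shared hidden layer feeding two softmax heads and a weighted sum of the two cross-entropy losses. The record does not specify the architecture, the loss, or the weighting, so the layer sizes, the `aux_weight` parameter, and every function name here are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(z):
    # Numerically stable softmax over a list of scores.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]


def linear(W, b, x):
    # W is a list of rows; returns W @ x + b as a list.
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]


def relu(v):
    return [max(0.0, a) for a in v]


def cross_entropy(probs, label):
    # Negative log-likelihood of the correct class.
    return -math.log(probs[label])


def init(rows, cols, rng):
    # Small random weights (illustrative initialisation only).
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def multitask_loss(params, x, main_label, aux_label, aux_weight=0.5):
    # Shared representation used by both tasks.
    h = relu(linear(params["W_shared"], params["b_shared"], x))
    # Task-specific heads on top of the shared layer.
    main_probs = softmax(linear(params["W_main"], params["b_main"], h))
    aux_probs = softmax(linear(params["W_aux"], params["b_aux"], h))
    main_loss = cross_entropy(main_probs, main_label)
    aux_loss = cross_entropy(aux_probs, aux_label)
    # Assumed combination: main loss plus a weighted secondary loss.
    return main_loss + aux_weight * aux_loss, main_loss, aux_loss


rng = random.Random(0)
params = {
    "W_shared": init(8, 4, rng), "b_shared": [0.0] * 8,
    "W_main": init(3, 8, rng), "b_main": [0.0] * 3,   # e.g. 3 object classes
    "W_aux": init(2, 8, rng), "b_aux": [0.0] * 2,     # e.g. 2 shape classes
}
x = [0.2, -0.1, 0.4, 0.7]  # toy feature vector, not real image data
total, main_loss, aux_loss = multitask_loss(params, x, main_label=1, aux_label=0)
print(total, main_loss, aux_loss)
```

Because both heads read the same hidden layer, gradients from the secondary loss would also update the shared weights during training, which is the mechanism by which a related secondary task can influence the main task.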


Pages: 37-42

Page count: 6
