Deep Learning Network for Pedestrian Attribute Recognition Based on Dynamic Multi-Task Balancing

Cited by: 0

Authors:

Sun Z. ^{[1]}

Ye J. ^{[1]}

Wang T. ^{[1]}

Lei L. ^{[2]}

Lian J. ^{[3]}

Li Y. ^{[4]}

Affiliations:

[1] Key Laboratory of Optoelectronic Technology and Systems of the Ministry of Education, Chongqing University, Chongqing

[2] School of Electronic Information Engineering, Yangtze Normal University, Chongqing

[3] Beijing Dilusense Technology Co., Ltd, Beijing

[4] Nanjing Pioneer Awareness Information Technology Co., Ltd, Nanjing

Source:

Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics | 2019, Vol. 31, No. 12

Keywords:

Attribute recognition; Deep learning; Loss function; Multi-task learning

DOI:

10.3724/SP.J.1089.2019.17654

Abstract:

Person attribute recognition extracts structured features of a person and plays a vital role in intelligent video surveillance tasks such as person re-identification. First, building on R*CNN, we design an end-to-end deep learning network for multi-attribute recognition. A region proposal network (RPN), rather than selective search, is employed to extract auxiliary regions, and a unified network for auxiliary region extraction and attribute recognition is constructed to improve the recognition of local attributes. Second, to strengthen the effect of the auxiliary regions, we split the body ROI proportionally into four regions: whole body, head, torso, and legs. Each region is responsible for a different set of attributes, and the network splits into four branches at the prediction stage. The primary regions and the second most important auxiliary regions are used jointly to predict attributes. Finally, dynamic adaptive loss weighting balances the contribution of each task: the loss weights are inversely correlated with the gradient of the loss function, which prevents any single task from training too fast or too slow. In comparison experiments on the Berkeley Attributes of People dataset, the method obtains the best mean average precision (mAP) among the compared state-of-the-art methods, exceeding 92%. © 2019, Beijing China Science Journal Publishing Co. Ltd. All rights reserved.
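
The abstract states only that the loss weights are inversely correlated with the gradient of the loss function; it does not give the formula. As a minimal sketch under that assumption, the Python below sets each task's weight proportional to the inverse of its gradient norm and rescales the weights so they sum to the number of tasks. The function names, the `eps` term, and the normalization are illustrative assumptions, not the authors' implementation.

```python
from typing import List


def inverse_gradient_weights(grad_norms: List[float], eps: float = 1e-8) -> List[float]:
    """Return one loss weight per task, inversely proportional to its gradient norm.

    Assumption: the weights are rescaled to sum to the number of tasks, so a
    uniform set of gradient norms yields a weight of 1.0 for every task.
    """
    if not grad_norms:
        raise ValueError("need at least one task")
    # eps guards against division by zero when a task's gradient vanishes.
    inverse = [1.0 / (g + eps) for g in grad_norms]
    total = sum(inverse)
    n_tasks = len(grad_norms)
    return [n_tasks * value / total for value in inverse]


def weighted_total_loss(losses: List[float], weights: List[float]) -> float:
    """Combine per-task losses into one scalar using the given weights."""
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    return sum(w * l for w, l in zip(weights, losses))


# Hypothetical gradient norms for four branches (whole body, head, torso, legs).
grad_norms = [1.0, 2.0, 4.0, 4.0]
weights = inverse_gradient_weights(grad_norms)
total = weighted_total_loss([0.7, 0.4, 0.9, 0.6], weights)
```

In this sketch a branch with a large gradient norm receives a smaller weight, and a branch with a small gradient norm receives a larger one, which is one way to keep the branches at comparable update magnitudes. In a training loop the weights would be recomputed periodically from the current gradient norms.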

Citation:

Pages: 2144-2151

Number of pages: 7
