Deep Orientation: Fast and Robust Upper Body Orientation Estimation for Mobile Robotic Applications

Cited by: 0

Authors:

Lewandowski, Benjamin [1]

Seichter, Daniel [1]

Wengefeld, Tim [1]

Pfennig, Lennard [1]

Drumm, Helge [2]

Gross, Horst-Michael [1]

Affiliations:

[1] Tech Univ Ilmenau, Neuroinformat & Cognit Robot Lab, D-98694 Ilmenau, Germany

[2] Tech Univ Ilmenau, Univ Comp Ctr, D-98694 Ilmenau, Germany

Source:

2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) | 2019

Keywords:

DOI:

10.1109/iros40897.2019.8968506

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

An essential feature for navigating socially with a mobile robot is the upper body orientation of persons in its vicinity. For example, in a supermarket orientation indicates whether a person is looking at goods on the shelves or where a person is likely to go. However, given limited computing and battery capabilities, it is not possible to rely on high-performance graphics cards to run large, computationally expensive deep neural networks for orientation estimation in real time. Nevertheless, deep learning performs quite well for regression problems. Therefore, we tackle the problem of upper body orientation estimation with small yet efficient deep neural networks on a mobile robot in this paper. We employ a fast person detection approach as preprocessing that outputs fixed size person images before the actual estimation of the orientation is done. The combination with lightweight networks allows us to estimate a continuous angle in real time, even using a CPU only. We experimentally evaluate the performance of our system on a new, self-recorded data set consisting of more than 100,000 RGB-D samples from 37 persons, which is made publicly available. We also do an extensive comparison of different network architectures and output encodings for their applicability in estimating orientations. Furthermore, we show that depth images are more suitable for the task of orientation estimation than RGB images or the combination of both.
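The abstract refers to comparing output encodings for a continuous angle but does not list them. As an illustration only, the sketch below shows one widely used option: representing an angle as a (cos, sin) unit vector, which avoids the discontinuity at 0°/360° that a raw scalar target has. The function names are hypothetical and this is not claimed to be the encoding or the evaluation code used by the authors.

```python
import math


def encode_angle(theta_deg):
    """Map an angle in degrees to a (cos, sin) pair on the unit circle."""
    theta = math.radians(theta_deg)
    return math.cos(theta), math.sin(theta)


def decode_angle(x, y):
    """Recover the angle in degrees, wrapped to 0..360, from an (x, y) pair.

    The pair does not need to be normalised: atan2 only depends on direction,
    so a raw two-value network output could be decoded the same way.
    """
    return math.degrees(math.atan2(y, x)) % 360.0


def angular_error(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)
```

With this representation, 350° and 10° are close together in the output space, and `angular_error(350, 10)` reports a 20° difference rather than 340°.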

Pages: 441-448

Page count: 8
