ENHANCING MULTI-STEP ACTION PREDICTION FOR ACTIVE OBJECT DETECTION

Cited by: 5

Authors:

Fang, Fen [1]

Xu, Qianli [1]

Gauthier, Nicolas [1]

Li, Liyuan [1]

Lim, Joo-Hwee [1,2]

Affiliations:

[1] A*STAR, Institute for Infocomm Research, Singapore, Singapore

[2] Nanyang Technological University, School of Computer Science and Engineering, Singapore, Singapore

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2021

Keywords:

active object detection; reinforcement learning; view planning; deep Q-learning network (DQN)

DOI:

10.1109/ICIP42928.2021.9506078

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Active vision for robots is one promising solution to open-world visual detection problems. A fundamental issue is view planning, i.e., predicting the next best views from which to capture images of interest and thereby reduce uncertainty. While multi-step action in a reinforcement learning (RL) setup can boost the efficiency of view planning, existing methods suffer from unstable detection outcomes when the Q-values of multiple branches of action advantages (i.e., action range and action type) are combined naively. To tackle this issue, we propose a novel mechanism to disentangle action range from action type through a two-stage training strategy on a deep Q-network. It combines well-crafted loss functions with respect to action range and action type to enforce separate training of these two branches. We evaluate our method on two public datasets and show that it yields a substantial gain in view planning efficiency while enhancing detection accuracy.
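To make the "naive combination" mentioned in the abstract concrete, the following is a minimal sketch of one plausible form of it: a dueling-style Q-value built by adding a state value to two mean-centred advantage branches. The additive formula, the function names `combine_branches` and `greedy_action`, and the toy numbers are all assumptions for illustration; the record does not give the network's equations, and this is not the two-stage method the paper proposes.

```python
from typing import List, Tuple


def combine_branches(value: float,
                     adv_type: List[float],
                     adv_range: List[float]) -> List[List[float]]:
    """Additively combine a state value with two advantage branches.

    Assumed (hypothetical) form:
        q[i][j] = V + (A_type[i] - mean(A_type)) + (A_range[j] - mean(A_range))
    where i indexes the action type and j indexes the action range.
    """
    mean_type = sum(adv_type) / len(adv_type)
    mean_range = sum(adv_range) / len(adv_range)
    return [[value + (a_t - mean_type) + (a_r - mean_range)
             for a_r in adv_range]
            for a_t in adv_type]


def greedy_action(q: List[List[float]]) -> Tuple[int, int]:
    """Return the (type_index, range_index) pair with the largest Q-value."""
    best = max((q[i][j], i, j)
               for i in range(len(q))
               for j in range(len(q[0])))
    return best[1], best[2]


# Toy example: three action types, two action ranges (made-up numbers).
q_table = combine_branches(1.0, [0.0, 2.0, 1.0], [0.5, 1.5])
print(greedy_action(q_table))
```

In this form a single scalar Q-value mixes both branches, so an error in either advantage estimate shifts the joint greedy choice. That coupling is presumably what training the two branches separately is meant to reduce.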


Pages: 2189-2193

Page count: 5

References: 23 in total (entries 1-10 shown)

[1] Ammirato, Phil, 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA), P1378, DOI 10.1109/ICRA.2017.7989164

[2] Ammirato, Phil, 2017, 2017 IEEE INT C ROB, P1378

[3] Atanasov, Nikolay; Sankaran, Bharath; Le Ny, Jerome; Pappas, George J.; Daniilidis, Kostas. Nonmyopic View Planning for Active Object Classification and Pose Estimation. IEEE TRANSACTIONS ON ROBOTICS, 2014, 30 (05): 1078-1090

[4] Han, Xiaoning; Liu, Huaping; Sun, Fuchun; Zhang, Xinyu. Active Object Detection With Multistep Action Prediction Using Deep Q-Network. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2019, 15 (06): 3723-3731

[5] He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian. Deep Residual Learning for Image Recognition. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 770-778

[6] Hodan, Tomas; Haluza, Pavel; Obdrzalek, Stepan; Matas, Jiri; Lourakis, Manolis; Zabulis, Xenophon. T-LESS: An RGB-D Dataset for 6D Pose Estimation of Texture-less Objects. 2017 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2017), 2017: 880-888

[7] Liu, Wei; Anguelov, Dragomir; Erhan, Dumitru; Szegedy, Christian; Reed, Scott; Fu, Cheng-Yang; Berg, Alexander C. SSD: Single Shot MultiBox Detector. COMPUTER VISION - ECCV 2016, PT I, 2016, 9905: 21-37

[8] Lv, Liangheng; Zhang, Sunjie; Ding, Derui; Wang, Yongxiong. Path Planning via an Improved DQN-Based Learning Policy. IEEE ACCESS, 2019, 7: 67319-67330

[9] Malmir, Mohsen; Sikka, Karan; Forster, Deborah; Fasel, Ian; Movellan, Javier R.; Cottrell, Garrison W. Deep active object recognition by joint label and action prediction. COMPUTER VISION AND IMAGE UNDERSTANDING, 2017, 156: 128-137

[10] Metz, Luke, 2019, Discrete sequential prediction of continuous actions for deep rl
