Self-Supervised Reinforcement Learning for Active Object Detection

Cited by: 8

Authors:

Fang, Fen [1]

Liang, Wenyu [1]

Wu, Yan [1]

Xu, Qianli [1]

Lim, Joo-Hwee [1]

Affiliations:

[1] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2022 / Vol. 7 / No. 04

Keywords:

Active perception; active object detection; path planning; self-supervised learning; reinforcement learning; RECOGNITION;

DOI:

10.1109/LRA.2022.3193019

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification code:

080202 ; 1405 ;

Abstract:

Active object detection (AOD) offers a significant advantage in expanding the perceptual capacity of a robotic system. AOD is formulated as a sequential action decision process that determines optimal viewpoints for identifying objects of interest in a visual scene. While reinforcement learning (RL) has been successfully used to solve many AOD problems, conventional RL methods suffer from (i) sample inefficiency and (ii) unstable outcomes due to the inter-dependencies of action type (direction of view change) and action range (step size of view change). To address these issues, we propose a novel self-supervised RL method, which employs self-supervised representations of viewpoints to initialize the policy network and a self-supervised loss on action range to enhance the optimization of the network parameters. The output and target pairs for the self-supervised loss are automatically generated from the policy network's online prediction and a range shrinkage algorithm (RSA), respectively. The proposed method is evaluated and benchmarked on two public datasets (T-LESS and AVD) using on-policy and off-policy RL algorithms. The results show that our method enhances detection accuracy and achieves faster convergence on both datasets. When evaluated in a more complex environment with a larger state space (where viewpoints are more densely sampled), our method achieves more robust and stable performance. Our experiment in a real robot application scenario, disambiguating similar objects in a cluttered scene, also demonstrates the effectiveness of the proposed method.


Pages: 10224-10231

Number of pages: 8
