Self-Supervised Reinforcement Learning Method for Vacant Parking Space Detection Based on Task Consistency and Corrupted Rewards

Cited by: 0
Authors
Nguyen, Manh-Hung [1]
Chao, Tzu-Yin [2 ]
Hsiao, Ching-Chun [2 ]
Li, Yung-Hui [3 ]
Huang, Ching-Chun [2 ]
Affiliations
[1] HCM Univ Technol & Educ, Fac Elect & Elect Engn, Ho Chi Minh City 700000, Vietnam
[2] Natl Yang Ming Chiao Tung Univ, Dept Comp Sci, Hsinchu 300093, Taiwan
[3] Hon Hai Res Inst, AI Res Ctr, New Taipei City 114699, Taiwan
Keywords
Task-consistency; reinforcement learning; learn from corrupted rewards; domain transfer; MODEL; INFERENCE;
DOI
10.1109/TITS.2023.3319531
Chinese Library Classification
TU [Building Science]
Subject Classification Code
0813
Abstract
This paper proposes a novel task-consistency learning method that enables us to train a vacant space detection network (target task) based on its logical consistency with the semantic outcomes of a flow-based motion behavior classifier (source task) in a parking lot. Note that the source task can introduce false detections during task-consistency learning, which implies noisy rewards or supervision. The target network can be trained in a reinforcement learning setting by designing the reward mechanism around semantic consistency. We also introduce a novel symmetric constraint to detect corrupted samples and reduce the effect of noisy rewards. Unlike conventional corrupted-learning methods that use only training losses to identify corrupted samples, our symmetric constraint also exploits the relationships among training samples to improve performance. Compared with conventional supervised detection methods, the main contribution of our work is the ability to learn a vacant space detector via semantic consistency rather than supervised labels. This dynamic learning property allows the proposed detector to be easily deployed and updated in various parking lots without heavy human labeling effort. Experiments demonstrate that our noisy task-consistency mechanism can be successfully applied to train a vacant space detector from scratch.
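To make the core idea concrete, below is a minimal sketch (not the authors' implementation) of how a consistency-based reward and a policy-gradient (REINFORCE-style) update might be wired together. The event labels ("enter", "leave", "none"), the tiny CNN, and all function and variable names are illustrative assumptions; the paper's actual source-task outputs, network architecture, and symmetric constraint for filtering corrupted rewards are not reproduced here.

```python
# Illustrative sketch only: a consistency reward between the target task
# (vacant-space detector) and a hypothetical source-task outcome, with a
# REINFORCE-style update. Labels, architecture, and hyperparameters are
# assumptions, not taken from the paper.
import torch
import torch.nn as nn


def consistency_reward(pred_before: int, pred_after: int, motion_event: str) -> float:
    """+1 if the detector's state change agrees with the observed motion
    behavior, -1 otherwise (the reward is noisy because the source task
    can produce false detections)."""
    if motion_event == "enter":          # a car drove in: vacant -> occupied
        expected = (1, 0)                # 1 = vacant, 0 = occupied
    elif motion_event == "leave":        # a car drove out: occupied -> vacant
        expected = (0, 1)
    else:                                # no motion: the state should not change
        return 1.0 if pred_before == pred_after else -1.0
    return 1.0 if (pred_before, pred_after) == expected else -1.0


class VacantSpaceDetector(nn.Module):
    """Tiny stand-in for a per-space vacant/occupied classifier."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2),
        )

    def forward(self, x):
        return self.net(x)


detector = VacantSpaceDetector()
opt = torch.optim.Adam(detector.parameters(), lr=1e-4)

# One synthetic training event: image crops of the same space before and
# after a motion event reported by the source-task classifier.
crop_before = torch.randn(1, 3, 64, 64)
crop_after = torch.randn(1, 3, 64, 64)
motion_event = "enter"

dist_b = torch.distributions.Categorical(logits=detector(crop_before))
dist_a = torch.distributions.Categorical(logits=detector(crop_after))
act_b, act_a = dist_b.sample(), dist_a.sample()
reward = consistency_reward(act_b.item(), act_a.item(), motion_event)

# Policy-gradient step: reinforce prediction pairs that are consistent with
# the source-task outcome, penalize inconsistent ones.
loss = -(dist_b.log_prob(act_b) + dist_a.log_prob(act_a)).sum() * reward
opt.zero_grad()
loss.backward()
opt.step()
```

In the paper, a symmetric constraint additionally identifies corrupted reward samples by exploiting relationships among training samples; that filtering step is omitted from this sketch.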
Pages: 1346-1363
Number of pages: 18