Pedestrian Street-Cross Action Recognition in Monocular Far Infrared Sequences

Cited by: 33
Authors
Brehar, Raluca Didona [1]
Muresan, Mircea Paul [1]
Marita, Tiberiu [1]
Vancea, Cristian-Cosmin [1]
Negru, Mihai [1]
Nedevschi, Sergiu [1]
Affiliations
[1] Tech Univ Cluj Napoca, Dept Comp Sci, Cluj Napoca 400114, Romania
Keywords
Cameras; Roads; Image recognition; Deep learning; Radar tracking; Feature extraction; Finite impulse response filters; Image processing; neural network; pattern recognition; night vision applications; FLIR camera; pedestrian detection; pedestrian tracking; semantic segmentation; time series analysis; NETWORKS; TRACKING
DOI
10.1109/ACCESS.2021.3080822
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recognizing and understanding pedestrian actions in traffic scenes early makes it possible to anticipate pedestrian intentions, which supports collision warning and avoidance in autonomous vehicles. Low-visibility conditions such as night-time, fog, heavy rain, or smoke increase the number of difficult traffic situations. This paper proposes a complete and original model for assessing whether a pedestrian is engaged in a street-cross action using only monocular infrared scene perception. The assessment is performed by time series analysis of features such as pedestrian motion, the position of pedestrians relative to the drivable area, and their distance from the ego-vehicle. These features are extracted by combining a deep-learning-based pedestrian detector with an original tracking algorithm, a semantic segmentation of the road surface, and action recognition based on a time series long short-term memory (LSTM) network. To validate the proposed method, we introduce a new dataset named CROSSIR, which consists of pedestrian annotations, action annotations, and semantic labels for the road. The CROSSIR dataset is suitable for several common computer vision tasks: (1) pedestrian detection and tracking, because each pedestrian has a unique identifier across the frames in which they appear; (2) pedestrian action recognition; and (3) semantic segmentation of road pixels in the infrared image.
Pages: 74302-74324
Page count: 23