Granulated RCNN and Multi-Class Deep SORT for Multi-Object Detection and Tracking

Cited by: 101

Authors:

Pramanik, Anima [1]

Pal, Sankar K. [2]

Maiti, J. [1]

Mitra, Pabitra [3]

Affiliations:

[1] Indian Inst Technol Kharagpur, Ind & Syst Engn, Kharagpur 721302, W Bengal, India

[2] ISI, Soft Comp, Kolkata 700108, India

[3] IIT Kharagpur, Comp Sci & Engn, Kharagpur 721302, W Bengal, India

Source:

IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2022 / Vol. 6 / No. 01

Keywords:

Videos; Detectors; Feature extraction; Proposals; Computational modeling; Steel; Object detection; Deep CNN; Foreground region proposal; Granulation; Object detection and tracking; Video analysis; SAR IMAGES; ENTROPY;

DOI:

10.1109/TETCI.2020.3041019

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this article, two new models, namely granulated RCNN (G-RCNN) and multi-class deep SORT (MCD-SORT), are developed for object detection and tracking, respectively, from videos. Object detection has two stages: object localization (region of interest, RoI) and classification. G-RCNN is an improved version of the well-known Fast RCNN and Faster RCNN for extracting RoIs by incorporating the unique concept of granulation in a deep convolutional neural network. Granulation with spatio-temporal information enables more accurate extraction of RoIs (object regions) in unsupervised mode. Compared to Fast and Faster RCNNs, G-RCNN uses (i) granules (clusters) formed over the pooling feature map, instead of all its feature values, in defining RoIs, (ii) only the positive RoIs during training, instead of the whole RoI-map, (iii) videos directly as input, rather than static images, and (iv) only the objects in RoIs, instead of the entire feature map, for performing object classification. All these lead to improvements in real-time detection speed and accuracy. MCD-SORT is an advanced form of the popular Deep SORT. In MCD-SORT, the search for associations between objects and trajectories is restricted to the same category. This increases the performance in multi-class tracking. These characteristic features have been demonstrated over 37 videos containing single-class, two-class, and multi-class objects. The superiority of the models over several state-of-the-art methodologies is also established extensively, both qualitatively and quantitatively.
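
The class-restricted association described for MCD-SORT can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes tracks and detections are given as hypothetical `(class_label, box)` tuples, scores pairs by box overlap (IoU) only, and uses a simple greedy matcher as a stand-in for the assignment step, whereas Deep SORT combines motion and appearance cues. The function names `iou` and `associate_by_class` and the `min_iou` threshold are illustrative assumptions.

```python
from collections import defaultdict


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate_by_class(tracks, detections, min_iou=0.3):
    """Match detections to tracks, searching only within the same class.

    tracks, detections: lists of (class_label, box) tuples (assumed format).
    Returns a sorted list of (track_index, detection_index) pairs.
    """
    # Group indices by class so that cross-class pairs are never scored.
    tracks_by_class = defaultdict(list)
    dets_by_class = defaultdict(list)
    for i, (label, _) in enumerate(tracks):
        tracks_by_class[label].append(i)
    for j, (label, _) in enumerate(detections):
        dets_by_class[label].append(j)

    matches = []
    for label, track_ids in tracks_by_class.items():
        candidates = [
            (iou(tracks[i][1], detections[j][1]), i, j)
            for i in track_ids
            for j in dets_by_class.get(label, [])
        ]
        # Greedy stand-in for the assignment step: highest overlap first.
        candidates.sort(key=lambda c: c[0], reverse=True)
        used_tracks, used_dets = set(), set()
        for score, i, j in candidates:
            if score < min_iou:
                break  # remaining candidates overlap even less
            if i in used_tracks or j in used_dets:
                continue
            used_tracks.add(i)
            used_dets.add(j)
            matches.append((i, j))
    return sorted(matches)
```

With a "person" track and a "car" track occupying the same box, a "car" detection is matched only to the "car" track, even though it overlaps the "person" track equally well. Grouping by class also shrinks the number of candidate pairs that have to be scored in each frame.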

Pages: 171-181

Number of pages: 11

31 items in total

[21] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks [J].