Granulated RCNN and Multi-Class Deep SORT for Multi-Object Detection and Tracking

Cited by: 101
Authors
Pramanik, Anima [1]
Pal, Sankar K. [2]
Maiti, J. [1]
Mitra, Pabitra [3]
Affiliations
[1] Indian Inst Technol Kharagpur, Ind & Syst Engn, Kharagpur 721302, W Bengal, India
[2] ISI, Soft Comp, Kolkata 700108, India
[3] IIT Kharagpur, Comp Sci & Engn, Kharagpur 721302, W Bengal, India
Source
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2022 / Vol. 6 / No. 1
Keywords
Videos; Detectors; Feature extraction; Proposals; Computational modeling; Steel; Object detection; Deep CNN; Foreground region proposal; Granulation; Object detection and tracking; Video analysis; SAR IMAGES; ENTROPY;
DOI
10.1109/TETCI.2020.3041019
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article, two new models, granulated RCNN (G-RCNN) and multi-class deep SORT (MCD-SORT), are developed for object detection and tracking, respectively, from videos. Object detection has two stages: object localization (region of interest, RoI) and classification. G-RCNN is an improved version of the well-known Fast RCNN and Faster RCNN for extracting RoIs by incorporating the unique concept of granulation in a deep convolutional neural network. Granulation with spatio-temporal information enables more accurate extraction of RoIs (object regions) in unsupervised mode. Compared to Fast and Faster RCNNs, G-RCNN uses (i) granules (clusters) formed over the pooling feature map, instead of all its feature values, in defining RoIs, (ii) only the positive RoIs during training, instead of the whole RoI-map, (iii) videos directly as input, rather than static images, and (iv) only the objects in RoIs, instead of the entire feature map, for performing object classification. Together, these lead to improvements in real-time detection speed and accuracy. MCD-SORT is an advanced form of the popular Deep SORT. In MCD-SORT, the search for associations between objects and trajectories is restricted to the same category, which improves performance in multi-class tracking. These characteristic features have been demonstrated over 37 videos containing single-class, two-class, and multi-class objects. The superiority of the models over several state-of-the-art methodologies is also established extensively, both qualitatively and quantitatively.
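The class-restricted association described for MCD-SORT can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in plain greedy IoU matching for Deep SORT's Kalman-filter and appearance-based matching, and the names `iou`, `associate_by_class`, and `min_iou` are hypothetical, chosen only for illustration. The one idea it shows is that a detection is only ever compared with tracks of its own class.

```python
from collections import defaultdict


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate_by_class(tracks, detections, min_iou=0.3):
    """Greedy IoU matching in which a detection may only join a same-class track.

    tracks, detections: lists of (class_label, box) tuples.
    Returns a sorted list of (track_index, detection_index) pairs.
    Assumption: greedy IoU stands in for Deep SORT's actual matching cost.
    """
    # Group detection indices by class so cross-class pairs are never scored.
    dets_by_class = defaultdict(list)
    for d_idx, (label, _) in enumerate(detections):
        dets_by_class[label].append(d_idx)

    # Score only same-class (track, detection) pairs above the IoU threshold.
    candidates = []
    for t_idx, (label, t_box) in enumerate(tracks):
        for d_idx in dets_by_class.get(label, []):
            score = iou(t_box, detections[d_idx][1])
            if score >= min_iou:
                candidates.append((score, t_idx, d_idx))

    # Accept pairs from highest to lowest IoU, using each track and detection once.
    matches, used_tracks, used_dets = [], set(), set()
    for _, t_idx, d_idx in sorted(candidates, reverse=True):
        if t_idx not in used_tracks and d_idx not in used_dets:
            matches.append((t_idx, d_idx))
            used_tracks.add(t_idx)
            used_dets.add(d_idx)
    return sorted(matches)


tracks = [("car", (0, 0, 10, 10)), ("person", (20, 20, 30, 30))]
detections = [
    ("person", (0, 0, 10, 10)),   # overlaps the car track exactly, but wrong class
    ("car", (1, 1, 11, 11)),
    ("person", (21, 21, 31, 31)),
]
matches = associate_by_class(tracks, detections)
```

In this example the first detection overlaps the car track perfectly, yet it is never considered for that track because its class is `person`. Restricting the search this way also shrinks the number of pairs to score, which is one plausible reading of why the abstract reports better multi-class tracking performance.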
Pages: 171-181
Page count: 11