Synergistic Integration of Transfer Learning and Deep Learning for Enhanced Object Detection in Digital Images

Cited by: 6
Authors
Waheed, Safa Riyadh [1 ,2 ]
Suaib, Norhaida Mohd [1 ]
Rahim, Mohd Shafry Mohd [3 ]
Khan, Amjad Rehman [4 ]
Bahaj, Saeed Ali [5 ]
Saba, Tanzila [4 ]
Affiliations
[1] Univ Teknol Malaysia, Fac Engn, Sch Comp, Skudai 81310, Johor Bahru, Malaysia
[2] Islamic Univ, Coll Tech Engn, Comp Tech Engn Dept, Najaf 54001, Iraq
[3] Univ Teknol Malaysia, Inst Human Ctr Engn, Media & Games Innovat Ctr Excellence, UTM IRDA Digital Media Ctr, Skudai 81310, Johor, Malaysia
[4] Prince Sultan Univ, Coll Comp & Informat Sci, Artificial Intelligence & Data Analyt Lab, Riyadh 11586, Saudi Arabia
[5] Prince Sattam Bin Abdulaziz Univ, Coll Business Adm, MIS Dept, Al Kharj 11942, Saudi Arabia
Keywords
Object detection; Convolutional neural networks; Transfer learning; Feature extraction; Synthetic data; Smart cities; Training; TL; DL; SSMD; CNN; VGG16; smart city; security; technological development; NEURAL-NETWORKS;
DOI
10.1109/ACCESS.2024.3354706
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
The world is progressing toward smart and secure cities, and the automatic recognition of human activity is among the essential milestones of smart city surveillance projects. Classifying group activity and detecting behavior, however, remain complex and indistinct tasks. Behavior classification systems that rely on visual data are broadly useful across domains such as video surveillance, human-computer interaction, and the safety infrastructure of smart cities, yet automatic behavior classification poses a significant challenge for live video captured by smart city surveillance systems. In this regard, transfer learning (TL) from convolutional neural networks (CNNs) pre-trained on images has emerged as a promising technique for object detection with deep neural networks (DNNs), resulting in improved localization performance for smart city surveillance. Against this backdrop, this paper explores various strategies for developing advanced synthetic datasets that could improve the accuracy, measured as mean average precision (mAP), of modern DNNs trained for object detection. TL was employed to address the limitation that deep learning (DL) requires a huge dataset. The KITTI datasets were used to train a contemporary DNN, the single-shot multiple box detector (SSMD), in TensorFlow. A variety of metrics were used to assess, in a real-world context, the efficacy of the novel automated TL system designed for object detection within the DL framework (referred to as OD-SSMD). The results showed that the developed system outperformed previous studies. Notably, it was able to autonomously identify and localize the attributes and objects present in digital images.
Pages: 13525-13536
Page count: 12