IntelPVT: intelligent patch-based pyramid vision transformers for object detection and classification

Cited by: 0
Authors
Divya Nimma
Zhaoxian Zhou
Affiliation
[1] University of Southern Mississippi, School of Computing Sciences and Computer Engineering
Source
International Journal of Machine Learning and Cybernetics | 2024 / Vol. 15
Keywords
Vision transformer; Object detection; Object classification; Pyramid vision transformer; Adaptive patch; Intelligent method
DOI
Not available
Abstract
Since the advent of Transformers, followed by Vision Transformers (ViTs), researchers have achieved enormous success in computer vision and object detection. However, the mechanism of splitting images into fixed-size patches poses a serious challenge in this area and results in a loss of useful information during object detection and classification. To overcome this challenge, we propose an innovative intelligent patching mechanism and integrate it seamlessly into the conventional patch-based ViT framework. The proposed method enables the use of patches with flexible sizes to capture and retain essential semantic content from input images, and therefore improves performance compared with conventional methods. Our method was evaluated on three well-known datasets, Microsoft Common Objects in Context (MSCOCO-2017), Pascal VOC (Visual Object Classes Challenge), and Cityscapes, for object detection and classification. The experimental results showed promising improvements in specific metrics, particularly at higher confidence thresholds, making it a notable performer in object detection and classification tasks.
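For illustration only, the fixed-size patching that the abstract identifies as the limitation of conventional ViTs can be sketched as follows. This is a minimal NumPy sketch of the baseline, not the authors' intelligent patching method; the function name split_into_fixed_patches and the parameter patch_size are hypothetical names chosen for this example.

```python
import numpy as np


def split_into_fixed_patches(image, patch_size):
    """Split an (H, W, C) image into non-overlapping square patches.

    Hypothetical helper illustrating conventional fixed-size ViT patching.
    Returns an array of shape (num_patches, patch_size * patch_size * C),
    with patches ordered row by row.
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    # View the image as a (rows, patch, cols, patch, C) grid of blocks.
    blocks = image.reshape(rows, patch_size, cols, patch_size, c)
    # Bring the two grid axes to the front: (rows, cols, patch, patch, C).
    blocks = blocks.transpose(0, 2, 1, 3, 4)
    # Flatten each patch into one vector, as a ViT would before projection.
    return blocks.reshape(rows * cols, patch_size * patch_size * c)


if __name__ == "__main__":
    demo = np.arange(8 * 8 * 3).reshape(8, 8, 3)
    print(split_into_fixed_patches(demo, 4).shape)
```

Because the grid is fixed in advance, an object that straddles a patch boundary is divided regardless of its content, which is the kind of information loss that flexible patch sizes are meant to reduce.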
Pages: 1767-1778
Number of pages: 11