Vision Transformer-based Real-Time Camouflaged Object Detection System at Edge

Cited by: 2

Authors:

Putatunda, Rohan [1]

Khan, Md Azim [1]

Gangopadhyay, Aryya [1]

Wang, Jianwu [1]

Busart, Carl [2]

Erbacher, Robert F. [2]

Affiliations:

[1] Univ Maryland Baltimore Cty, Dept Informat Syst, Baltimore, MD 21228 USA

[2] DEVCOM Army Res Lab, Adelphi, MD USA

Source:

2023 IEEE INTERNATIONAL CONFERENCE ON SMART COMPUTING, SMARTCOMP | 2023

Keywords:

Camouflaged Object Detection; Multi-Modality; Vision Transformer; GRAD-CAM

DOI:

10.1109/SMARTCOMP58114.2023.00029

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Camouflaged object detection is a challenging task in computer vision that involves identifying objects that are intentionally or unintentionally hidden in their surrounding environment. Vision Transformer mechanisms play a critical role in improving the performance of deep learning models by focusing on the most relevant features that help object detection under camouflaged conditions. In this paper, we use a vision transformer (VT) in two phases: (a) integrating the VT with a deep learning architecture for efficient monocular depth map generation for camouflaged objects, and (b) embedding the VT in a multiclass object detection model with multimodal feature input (RGB with RGB-D), which increases the visual cues and provides more representational information to the model for performance enhancement. Additionally, we perform an ablation study to understand the role of the vision transformer in camouflaged object detection and incorporate GRAD-CAM on top of the model to visualize the performance improvement achieved by embedding the VT in the model architecture. We deploy the model on resource-constrained edge devices for real-time object detection to test the performance of the trained model under realistic conditions.

Citation

Pages: 90-97

Page count: 8

50 entries in total

[21] Edge Perception Camouflaged Object Detection Under Frequency Domain Reconstruction [J].
Liu, Zijian; Deng, Xiaoheng; Jiang, Ping; Lv, Conghao; Min, Geyong; Wang, Xin.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10): 10194-10207

[22] Two guidance joint network based on coarse map and edge map for camouflaged object detection [J].
Tang, Zhe; Tang, Jing; Zou, Dengpeng; Rao, Junyi; Qi, Fang.
APPLIED INTELLIGENCE, 2024, 54 (15-16): 7531-7544

[23] Focal DETR: Target-Aware Token Design for Transformer-Based Object Detection [J].
Xie, Tianming; Zhang, Zhonghao; Tian, Jing; Ma, Lihong.
SENSORS, 2022, 22 (22)

[24] AnoViT: Unsupervised Anomaly Detection and Localization With Vision Transformer-Based Encoder-Decoder [J].
Lee, Yunseung; Kang, Pilsung.
IEEE ACCESS, 2022, 10: 46717-46724

[25] MaxCerVixT: A novel lightweight vision transformer-based Approach for precise cervical cancer detection [J].
Pacal, Ishak.
KNOWLEDGE-BASED SYSTEMS, 2024, 289

[26] Vision Transformer-based recognition of diabetic retinopathy grade [J].
Wu, Jianfang; Hu, Ruo; Xiao, Zhenghong; Chen, Jiaxu; Liu, Jingwei.
MEDICAL PHYSICS, 2021, 48 (12): 7850-7863

[27] Hierarchical Graph Interaction Transformer With Dynamic Token Clustering for Camouflaged Object Detection [J].
Yao, Siyuan; Sun, Hao; Xiang, Tian-Zhu; Wang, Xiao; Cao, Xiaochun.
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33: 5936-5948

[28] Strawberry disease identification with vision transformer-based models [J].
Nguyen, Hai Thanh; Tran, Tri Dac; Nguyen, Thanh Tuong; Pham, Nhi Minh; Nguyen Ly, Phuc Hoang; Luong, Huong Hoang.
MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (29): 73101-73126

[29] RT-DEKT: real-time object detector with KAN-Transformer [J].
Jin, Zhanao; Li, Changlu; Lei, Zhichun.
SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (06)

[30] Camouflaged Object Detection Based on Ternary Cascade Perception [J].
Jiang, Xinhao; Cai, Wei; Ding, Yao; Wang, Xin; Yang, Zhiyong; Di, Xingyu; Gao, Weijie.
REMOTE SENSING, 2023, 15 (05)
