A Smart Dual-modal Aligned Transformer Deep Network for Robotic Grasp Detection

Cited by: 0

Authors:

Cang, Xin [1]

Zhang, Haojun [1]

Yang, Yuequan [1]

Cao, Zhiqiang [2]

Li, Fudong [1]

Zhu, Jiaming [1]

Affiliations:

[1] Yangzhou Univ, Sch Informat Engn, Yangzhou, Jiangsu, Peoples R China

[2] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China

Source:

2024 14TH ASIAN CONTROL CONFERENCE, ASCC 2024 | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

Dual modalities; Feature alignment; Robotic grasping; Transformer;

DOI:

Not available

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Robotic grasping is one of the crucial visual tasks for service robots as well as industrial robots. Most existing deep vision learning approaches for robotic grasping either treat RGB-D as a single modality or use the two modalities indiscriminately, which often overlooks the valuable depth information in RGB-D images. To address this limitation, this paper proposes a smart dual-modal aligned transformer deep network (SATNet), which is both very lightweight and well-performing for robotic grasping tasks using RGB-D images. Specifically, a novel ATFormer module with two parallel aligned transformer encoder blocks is designed to fuse global feature maps efficiently. Experiments on the Cornell dataset demonstrate that the proposed model outperforms existing methods: it has an impressively lightweight framework with only 0.27M parameters while achieving an accuracy of 97.8% and an inference time of 16.3 ms.

Pages: 1230-1235

Page count: 6

41 entries in total

[21] Robotic Objects Detection and Grasping in Clutter Based on Cascaded Deep Convolutional Neural Network
Liu, Dong
Tao, Xiantong
Yuan, Liheng
Du, Yu
Cong, Ming
[J]. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2022, 71
[22] Multimodal transformer graph convolution attention isomorphism network (MTCGAIN): a novel deep network for detection of insomnia disorder
Wang, Yulong
Ren, Yande
Bi, Yuzhen
Zhao, Feng
Bai, Xingzhen
Wei, Liangzhou
Liu, Wanting
Ma, Hancheng
Bai, Peirui
[J]. QUANTITATIVE IMAGING IN MEDICINE AND SURGERY, 2024, 14 (05) : 3350 - 3365
[23] A transformer-based deep neural network for arrhythmia detection using continuous ECG signals
Hu, Rui
Chen, Jie
Zhou, Li
[J]. COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 144
[24] A Multi-Scale Transformer Fusion Deep Clustering Network for Unsupervised Planetary Change Detection
Jia, Yutong
Wan, Gang
Liu, Jia
Zhao, Chenxu
Wang, Guoping
Zhang, Yifan
Liu, Lei
Xie, Bin
[J]. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2024, 21 : 1 - 5
[25] A Dual Transformer-Based Deep Learning Model for Passenger Anomaly Behavior Detection in Elevator Cabs
Ji, Yijin
Sun, Haoxiang
Xu, Benlian
Lu, Mingli
Zhou, Xu
Shi, Jian
[J]. INTERNATIONAL JOURNAL OF SWARM INTELLIGENCE RESEARCH, 2024, 15 (01)
[26] Computer-assisted diagnosis for axillary lymph node metastasis of early breast cancer based on transformer with dual-modal adaptive mid-term fusion using ultrasound elastography
Gong, Chihao
Wu, Yinglan
Zhang, Guangyuan
Liu, Xuan
Zhu, Xiaoyao
Cai, Nian
Li, Jian
[J]. COMPUTERIZED MEDICAL IMAGING AND GRAPHICS, 2025, 119
[27] Transformer guidance dual-stream network for salient object detection in optical remote sensing images
Zhang, Yi
Guo, Jichang
Yue, Huihui
Yin, Xiangjun
Zheng, Sida
[J]. NEURAL COMPUTING & APPLICATIONS, 2023, 35 (24) : 17733 - 17747
[28] Transformer guidance dual-stream network for salient object detection in optical remote sensing images
Yi Zhang
Jichang Guo
Huihui Yue
Xiangjun Yin
Sida Zheng
[J]. Neural Computing and Applications, 2023, 35 : 17733 - 17747
[29] Remote Sensing Image Change Detection Transformer Network Based on Dual-Feature Mixed Attention
Song, Xinyang
Hua, Zhen
Li, Jinjiang
[J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
[30] Unifying convolution and transformer: a dual stage network equipped with cross-interactive multi-modal feature fusion and edge guidance for RGB-D salient object detection
Abraham S.E.
Kovoor B.C.
[J]. Journal of Ambient Intelligence and Humanized Computing, 2024, 15 (04) : 2341 - 2359