HTC-Grasp: A Hybrid Transformer-CNN Architecture for Robotic Grasp Detection

Cited by: 5
Authors
Zhang, Qiang [1]
Zhu, Jianwei [1]
Sun, Xueying [1]
Liu, Mingmin [2]
Affiliations
[1] Jiangsu Univ Sci & Technol, Sch Automat, 666 Changhui Rd, Zhenjiang 212100, Peoples R China
[2] SIASUN Robot & Automat Co Ltd, Cent Res Inst, 16 Jinhui St, Shenyang 110168, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
robotic grasp; transformer; attentional mechanism
DOI
10.3390/electronics12061505
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Accurately detecting suitable grasp areas for unknown objects from visual information remains a challenging task. Drawing inspiration from the success of the Vision Transformer in visual detection, a hybrid Transformer-CNN architecture for robotic grasp detection, HTC-Grasp, is developed to improve the accuracy of grasping unknown objects. The architecture employs an external attention-based hierarchical Transformer as an encoder to capture global context and correlation features across the entire dataset. A channel-wise attention-based CNN decoder adaptively adjusts the channel weights, resulting in more efficient feature aggregation. The proposed method is validated on the Cornell and Jacquard datasets, achieving image-wise detection accuracies of 98.3% and 95.8%, respectively, and object-wise detection accuracies of 96.9% and 92.4% on the same datasets. A physical experiment performed with an Elite 6-DoF robot achieves a grasping accuracy of 93.3%, demonstrating the method's ability to grasp unknown objects in real scenarios. These results indicate that the proposed method outperforms other state-of-the-art methods.
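The abstract names external attention as the encoder's building block but does not give its formula. As a rough illustration only, the sketch below assumes the commonly described form of external attention: two small learnable memories shared by every sample (which is how dataset-wide correlations can be captured), followed by a double normalization. The function name `external_attention`, the memory names `mem_k` and `mem_v`, and all sizes are hypothetical and are not taken from the HTC-Grasp paper.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def external_attention(features, mem_k, mem_v):
    """Assumed external attention: features (N x d), mem_k and mem_v (S x d).

    mem_k and mem_v stand in for learnable parameters shared across all
    inputs; here they are plain lists supplied by the caller.
    """
    # Similarity of every token to every memory slot: N x S.
    attn = matmul(features, [list(col) for col in zip(*mem_k)])
    n_tokens, n_slots = len(attn), len(attn[0])

    # Double normalization, step 1: softmax over the token axis (per column).
    for j in range(n_slots):
        col_max = max(attn[i][j] for i in range(n_tokens))
        exps = [math.exp(attn[i][j] - col_max) for i in range(n_tokens)]
        total = sum(exps)
        for i in range(n_tokens):
            attn[i][j] = exps[i] / total

    # Double normalization, step 2: L1-normalize over the memory axis (per row).
    for i in range(n_tokens):
        row_sum = sum(attn[i]) + 1e-9
        attn[i] = [a / row_sum for a in attn[i]]

    # Each output token is a weighted mix of the value-memory rows: N x d.
    return matmul(attn, mem_v)


# Toy usage: 3 tokens, feature width 2, 4 memory slots.
tokens = [[0.1, 0.2], [0.3, -0.4], [0.5, 0.6]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
out = external_attention(tokens, keys, values)
print(len(out), len(out[0]))
```

Because the memories do not depend on the input, the cost grows linearly with the number of tokens rather than quadratically as in self-attention. Whether HTC-Grasp uses exactly this normalization is not stated in the abstract.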
Pages: 16