Learning Cross-Attention Point Transformer With Global Porous Sampling

Times Cited: 0
Authors
Duan, Yueqi [1 ]
Sun, Haowen [2 ]
Yan, Juncheng [2 ]
Lu, Jiwen [2 ]
Zhou, Jie [2 ]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Point cloud compression; Transformers; Global Positioning System; Convolution; Three-dimensional displays; Geometry; Feature extraction; Training data; Sun; Shape; Point cloud; 3D deep learning; transformer; cross-attention; NETWORK;
DOI
10.1109/TIP.2024.3486612
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a point-based cross-attention transformer named CrossPoints with a parametric Global Porous Sampling (GPS) strategy. The attention module is crucial for capturing the correlations between different tokens in transformers. Most existing point-based transformers design multi-scale self-attention operations on point clouds down-sampled by the widely-used Farthest Point Sampling (FPS) strategy. However, FPS only generates sub-clouds with holistic structures, which fails to fully exploit the flexibility of points to generate diversified tokens for the attention module. To address this, we design a cross-attention module with parametric GPS and Complementary GPS (C-GPS) strategies to generate a series of diversified tokens through controllable parameters. We show that FPS is a degenerate case of GPS, and that the network learns richer relational information about structure and geometry when we perform consecutive cross-attention over the tokens generated from GPS- as well as C-GPS-sampled points. More specifically, we set evenly-sampled points as queries and design our cross-attention layers with GPS- and C-GPS-sampled points as keys and values. To further improve the diversity of tokens, we design a deformable operation over points to adaptively adjust the points according to the input. Extensive experimental results on both shape classification and indoor scene segmentation tasks indicate promising boosts over recent point cloud transformers. We also conduct ablation studies to show the effectiveness of our proposed cross-attention module with the GPS strategy.
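The pipeline the abstract describes — sample one set of points as queries and attend over a differently sampled, complementary set as keys and values — can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it uses plain FPS for the queries, the complementary index set as a crude stand-in for C-GPS (the actual GPS/C-GPS strategies are parametric and learned), and raw coordinates in place of learned features.

```python
import numpy as np

def farthest_point_sampling(points, k, seed=0):
    # Greedy FPS: repeatedly pick the point farthest from the already-chosen set.
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    chosen = [int(rng.integers(n))]
    dists = np.full(n, np.inf)
    for _ in range(k - 1):
        dists = np.minimum(dists, np.linalg.norm(points - points[chosen[-1]], axis=1))
        chosen.append(int(np.argmax(dists)))
    return np.array(chosen)

def cross_attention(q_feats, kv_feats):
    # Scaled dot-product cross-attention: each query token attends over all key/value tokens.
    scale = 1.0 / np.sqrt(q_feats.shape[1])
    logits = q_feats @ kv_feats.T * scale
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability for softmax
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ kv_feats

# Toy point cloud: 64 points in 3D.
rng = np.random.default_rng(1)
pts = rng.normal(size=(64, 3))
q_idx = farthest_point_sampling(pts, 8)              # evenly-spread query tokens
kv_idx = np.setdiff1d(np.arange(64), q_idx)          # complementary tokens (stand-in for C-GPS)
out = cross_attention(pts[q_idx], pts[kv_idx])
print(out.shape)  # (8, 3)
```

The point of using two different samplings is that the query tokens and the key/value tokens carry non-redundant views of the same shape, so the attention weights encode relations between sub-structures rather than within a single down-sampled copy of the cloud.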
Pages: 6283-6297
Page count: 15