Learning Cross-Attention Point Transformer With Global Porous Sampling

Cited by: 0
Authors
Duan, Yueqi [1 ]
Sun, Haowen [2 ]
Yan, Juncheng [2 ]
Lu, Jiwen [2 ]
Zhou, Jie [2 ]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Point cloud compression; Transformers; Global Positioning System; Convolution; Three-dimensional displays; Geometry; Feature extraction; Training data; Sun; Shape; Point cloud; 3D deep learning; transformer; cross-attention; NETWORK;
DOI
10.1109/TIP.2024.3486612
CLC Classification Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a point-based cross-attention transformer named CrossPoints with a parametric Global Porous Sampling (GPS) strategy. The attention module is crucial for capturing the correlations between different tokens in transformers. Most existing point-based transformers design multi-scale self-attention operations on point clouds down-sampled by the widely used Farthest Point Sampling (FPS) strategy. However, FPS only generates sub-clouds with holistic structures, which fails to fully exploit the flexibility of points to generate diversified tokens for the attention module. To address this, we design a cross-attention module with parametric GPS and Complementary GPS (C-GPS) strategies to generate a series of diversified tokens through controllable parameters. We show that FPS is a degenerate case of GPS, and that the network learns richer relational information about structure and geometry when we perform consecutive cross-attention over the tokens generated from GPS- and C-GPS-sampled points. More specifically, we set evenly sampled points as queries and design our cross-attention layers with GPS- and C-GPS-sampled points as keys and values. To further improve the diversity of tokens, we design a deformable operation over points that adaptively adjusts the points according to the input. Extensive experimental results on both shape classification and indoor scene segmentation tasks indicate promising boosts over recent point cloud transformers. We also conduct ablation studies to demonstrate the effectiveness of our proposed cross-attention module with the GPS strategy.
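
Illustrative sketch (not from the paper): the PyTorch snippet below shows one way the cross-attention wiring described in the abstract could look, with query tokens taken from an FPS-sampled sub-cloud and key/value tokens taken from a differently sampled sub-cloud. The porous_sample function is only a random-subset placeholder standing in for the parametric GPS / C-GPS strategies, whose exact formulation is not given in this record, and the helper names (farthest_point_sample, CrossAttentionBlock) are hypothetical.

# Minimal sketch of cross-attention between two differently sampled point-cloud
# token sets (assumed PyTorch implementation; NOT the authors' CrossPoints code).
import torch
import torch.nn as nn


def farthest_point_sample(xyz: torch.Tensor, m: int) -> torch.Tensor:
    """Standard FPS; xyz is (B, N, 3), returns indices of m evenly spread points, (B, m)."""
    B, N, _ = xyz.shape
    idx = torch.zeros(B, m, dtype=torch.long, device=xyz.device)
    dist = torch.full((B, N), float("inf"), device=xyz.device)
    farthest = torch.randint(0, N, (B,), device=xyz.device)
    batch = torch.arange(B, device=xyz.device)
    for i in range(m):
        idx[:, i] = farthest
        centroid = xyz[batch, farthest].unsqueeze(1)            # (B, 1, 3)
        dist = torch.minimum(dist, ((xyz - centroid) ** 2).sum(-1))
        farthest = dist.argmax(-1)                              # point farthest from the chosen set
    return idx


def porous_sample(xyz: torch.Tensor, m: int) -> torch.Tensor:
    """Placeholder for the paper's parametric GPS / C-GPS: here just a random subset."""
    B, N, _ = xyz.shape
    return torch.rand(B, N, device=xyz.device).argsort(-1)[:, :m]


class CrossAttentionBlock(nn.Module):
    """Query tokens attend to key/value tokens drawn from a different sub-cloud."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, q_feat: torch.Tensor, kv_feat: torch.Tensor) -> torch.Tensor:
        out, _ = self.attn(q_feat, kv_feat, kv_feat)            # cross-attention
        return self.norm(q_feat + out)                          # residual + layer norm


if __name__ == "__main__":
    B, N, C, M = 2, 1024, 64, 256
    xyz, feat = torch.rand(B, N, 3), torch.rand(B, N, C)
    rows = torch.arange(B).unsqueeze(-1)
    q = feat[rows, farthest_point_sample(xyz, M)]               # queries: evenly sampled tokens
    kv = feat[rows, porous_sample(xyz, M)]                      # keys/values: porous-sampled tokens
    print(CrossAttentionBlock(C)(q, kv).shape)                  # torch.Size([2, 256, 64])

In the actual method, consecutive cross-attention is also performed over C-GPS-sampled tokens and a deformable point adjustment is applied; neither is modeled in this sketch.
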
Pages: 6283-6297
Page count: 15
Related Papers
50 items in total (items [41]-[50] shown)
  • [41] DCCAT: Dual-Coordinate Cross-Attention Transformer for thrombus segmentation on coronary OCT
    Chu, Miao; De Maria, Giovanni Luigi; Dai, Ruobing; Benenati, Stefano; Yu, Wei; Zhong, Jiaxin; Kotronias, Rafail; Walsh, Jason; Andreaggi, Stefano; Zuccarelli, Vittorio; Chai, Jason; Channon, Keith; Banning, Adrian; Tu, Shengxian
    MEDICAL IMAGE ANALYSIS, 2024, 97
  • [42] CerviFormer: A pap smear-based cervical cancer classification method using cross-attention and latent transformer
    Deo, Bhaswati Singha; Pal, Mayukha; Panigrahi, Prasanta K.; Pradhan, Asima
    INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY, 2024, 34 (02)
  • [43] Twins transformer: rolling bearing fault diagnosis based on cross-attention fusion of time and frequency domain features
    Gao, Zhikang; Wang, Yanxue; Li, Xinming; Yao, Jiachi
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2024, 35 (09)
  • [44] Hyperspectral Image Classification via Cascaded Spatial Cross-Attention Network
    Zhang, Bo; Chen, Yaxiong; Xiong, Shengwu; Lu, Xiaoqiang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2025, 34: 899-913
  • [45] Robust Image Watermarking based on Cross-Attention and Invariant Domain Learning
    Dasgupta, Agnibh; Thong, Xin
    2023 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE, CSCI 2023, 2023: 1125-1132
  • [46] A Cross-Attention Multi-Scale Performer With Gaussian Bit-Flips for File Fragment Classification
    Liu, Sisung; Park, Jeong Gyu; Kim, Hyeongsik; Hong, Je Hyeong
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2025, 20: 2109-2121
  • [47] GazeSymCAT: A symmetric cross-attention transformer for robust gaze estimation under extreme head poses and gaze variations
    Zhong, Yupeng; Lee, Sang Hun
    JOURNAL OF COMPUTATIONAL DESIGN AND ENGINEERING, 2025, 12 (03): 115-129
  • [48] Input-output Driven Cross-Attention for Transformer for Quality Prediction of Light Naphtha in Industrial Hydrocracking Processes
    Yang, Ziyi; Yuan, Xiaofeng; Wang, Kai; Chen, Zhiwen; Wang, Yalin; Yang, Chunhua; Gui, Weihua
    IFAC PAPERSONLINE, 2024, 58 (14): 85-90
  • [49] CAT-DTI: cross-attention and Transformer network with domain adaptation for drug-target interaction prediction
    Zeng, Xiaoting; Chen, Weilin; Lei, Baiying
    BMC BIOINFORMATICS, 2024, 25 (01)
  • [50] Video question answering via grounded cross-attention network learning
    Ye, Yunan; Zhang, Shifeng; Li, Yimeng; Qian, Xufeng; Tang, Siliang; Pu, Shiliang; Xiao, Jun
    INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (04)