Learning Cross-Attention Point Transformer With Global Porous Sampling

Cited by: 0
Authors
Duan, Yueqi [1 ]
Sun, Haowen [2 ]
Yan, Juncheng [2 ]
Lu, Jiwen [2 ]
Zhou, Jie [2 ]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Point cloud compression; Transformers; Global Positioning System; Convolution; Three-dimensional displays; Geometry; Feature extraction; Training data; Sun; Shape; Point cloud; 3D deep learning; transformer; cross-attention; NETWORK;
DOI
10.1109/TIP.2024.3486612
CLC Classification Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a point-based cross-attention transformer named CrossPoints with a parametric Global Porous Sampling (GPS) strategy. The attention module is crucial for capturing the correlations between different tokens in transformers. Most existing point-based transformers design multi-scale self-attention operations on point clouds down-sampled by the widely used Farthest Point Sampling (FPS) strategy. However, FPS only generates sub-clouds with holistic structures, which fails to fully exploit the flexibility of points to generate diversified tokens for the attention module. To address this, we design a cross-attention module with parametric GPS and Complementary GPS (C-GPS) strategies to generate a series of diversified tokens through controllable parameters. We show that FPS is a degenerate case of GPS, and that the network learns richer relational information about structure and geometry when we perform consecutive cross-attention over the tokens generated from GPS- and C-GPS-sampled points. More specifically, we set evenly sampled points as queries and design our cross-attention layers with GPS- and C-GPS-sampled points as keys and values. To further improve the diversity of tokens, we design a deformable operation over points that adaptively adjusts the points according to the input. Extensive experimental results on both shape classification and indoor scene segmentation tasks indicate promising boosts over recent point cloud transformers. We also conduct ablation studies to demonstrate the effectiveness of our proposed cross-attention module with the GPS strategy.
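
Illustrative sketch (not from the paper): the PyTorch snippet below shows one way the cross-attention wiring described in the abstract could look, with query tokens taken from an FPS-sampled sub-cloud and key/value tokens taken from a differently sampled sub-cloud. The porous_sample function is only a random-subset placeholder standing in for the parametric GPS / C-GPS strategies, whose exact formulation is not given in this record, and the helper names (farthest_point_sample, CrossAttentionBlock) are hypothetical.

# Minimal sketch of cross-attention between two differently sampled point-cloud
# token sets (assumed PyTorch implementation; NOT the authors' CrossPoints code).
import torch
import torch.nn as nn


def farthest_point_sample(xyz: torch.Tensor, m: int) -> torch.Tensor:
    """Standard FPS; xyz is (B, N, 3), returns indices of m evenly spread points, (B, m)."""
    B, N, _ = xyz.shape
    idx = torch.zeros(B, m, dtype=torch.long, device=xyz.device)
    dist = torch.full((B, N), float("inf"), device=xyz.device)
    farthest = torch.randint(0, N, (B,), device=xyz.device)
    batch = torch.arange(B, device=xyz.device)
    for i in range(m):
        idx[:, i] = farthest
        centroid = xyz[batch, farthest].unsqueeze(1)            # (B, 1, 3)
        dist = torch.minimum(dist, ((xyz - centroid) ** 2).sum(-1))
        farthest = dist.argmax(-1)                              # point farthest from the chosen set
    return idx


def porous_sample(xyz: torch.Tensor, m: int) -> torch.Tensor:
    """Placeholder for the paper's parametric GPS / C-GPS: here just a random subset."""
    B, N, _ = xyz.shape
    return torch.rand(B, N, device=xyz.device).argsort(-1)[:, :m]


class CrossAttentionBlock(nn.Module):
    """Query tokens attend to key/value tokens drawn from a different sub-cloud."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, q_feat: torch.Tensor, kv_feat: torch.Tensor) -> torch.Tensor:
        out, _ = self.attn(q_feat, kv_feat, kv_feat)            # cross-attention
        return self.norm(q_feat + out)                          # residual + layer norm


if __name__ == "__main__":
    B, N, C, M = 2, 1024, 64, 256
    xyz, feat = torch.rand(B, N, 3), torch.rand(B, N, C)
    rows = torch.arange(B).unsqueeze(-1)
    q = feat[rows, farthest_point_sample(xyz, M)]               # queries: evenly sampled tokens
    kv = feat[rows, porous_sample(xyz, M)]                      # keys/values: porous-sampled tokens
    print(CrossAttentionBlock(C)(q, kv).shape)                  # torch.Size([2, 256, 64])

In the actual method, consecutive cross-attention is also performed over C-GPS-sampled tokens and a deformable point adjustment is applied; neither is modeled in this sketch.
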
Pages: 6283-6297
Page count: 15
Related Papers
50 items in total (items [41]-[50] shown)
  • [41] DCCAT: Dual-Coordinate Cross-Attention Transformer for thrombus segmentation on coronary OCT
    Chu, Miao; De Maria, Giovanni Luigi; Dai, Ruobing; Benenati, Stefano; Yu, Wei; Zhong, Jiaxin; Kotronias, Rafail; Walsh, Jason; Andreaggi, Stefano; Zuccarelli, Vittorio; Chai, Jason; Channon, Keith; Banning, Adrian; Tu, Shengxian
    MEDICAL IMAGE ANALYSIS, 2024, 97
  • [42] CerviFormer: A pap smear-based cervical cancer classification method using cross-attention and latent transformer
    Deo, Bhaswati Singha; Pal, Mayukha; Panigrahi, Prasanta K.; Pradhan, Asima
    INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY, 2024, 34 (02)
  • [43] Twins transformer: rolling bearing fault diagnosis based on cross-attention fusion of time and frequency domain features
    Gao, Zhikang; Wang, Yanxue; Li, Xinming; Yao, Jiachi
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2024, 35 (09)
  • [44] Hyperspectral Image Classification via Cascaded Spatial Cross-Attention Network
    Zhang, Bo; Chen, Yaxiong; Xiong, Shengwu; Lu, Xiaoqiang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2025, 34: 899-913
  • [45] Robust Image Watermarking based on Cross-Attention and Invariant Domain Learning
    Dasgupta, Agnibh; Thong, Xin
    2023 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE, CSCI 2023, 2023: 1125-1132
  • [46] A Cross-Attention Multi-Scale Performer With Gaussian Bit-Flips for File Fragment Classification
    Liu, Sisung; Park, Jeong Gyu; Kim, Hyeongsik; Hong, Je Hyeong
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2025, 20: 2109-2121
  • [47] GazeSymCAT: A symmetric cross-attention transformer for robust gaze estimation under extreme head poses and gaze variations
    Zhong, Yupeng; Lee, Sang Hun
    JOURNAL OF COMPUTATIONAL DESIGN AND ENGINEERING, 2025, 12 (03): 115-129
  • [48] Input-output Driven Cross-Attention for Transformer for Quality Prediction of Light Naphtha in Industrial Hydrocracking Processes
    Yang, Ziyi; Yuan, Xiaofeng; Wang, Kai; Chen, Zhiwen; Wang, Yalin; Yang, Chunhua; Gui, Weihua
    IFAC PAPERSONLINE, 2024, 58 (14): 85-90
  • [49] CAT-DTI: cross-attention and Transformer network with domain adaptation for drug-target interaction prediction
    Zeng, Xiaoting; Chen, Weilin; Lei, Baiying
    BMC BIOINFORMATICS, 2024, 25 (01)
  • [50] Video question answering via grounded cross-attention network learning
    Ye, Yunan; Zhang, Shifeng; Li, Yimeng; Qian, Xufeng; Tang, Siliang; Pu, Shiliang; Xiao, Jun
    INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (04)