Learning Geometric Information via Transformer Network for Key-Points Based Motion Segmentation

Citations: 0

Authors:

Li, Qiming [1]

Cheng, Jinghang [1]

Gao, Yin [1]

Li, Jun [1]

Affiliations:

[1] Chinese Acad Sci, Haixi Inst, Quanzhou Inst Equipment Mfg, Lab Robot & Intelligent Syst, Quanzhou 362216, Fujian, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024 / Volume 34 / Issue 09

Funding:

National Natural Science Foundation of China;

Keywords:

Geometric information embedding; transformer; self-attention; motion segmentation; VIDEO OBJECT SEGMENTATION; MULTIPLE-STRUCTURE DATA; CONSENSUS; TRACKING; GRAPHS;

DOI:

10.1109/TCSVT.2024.3382363

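Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes: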

0808 ; 0809 ;

Abstract:

With the emergence of Vision Transformers, attention-based modules have demonstrated performance comparable or superior to that of CNNs on various vision tasks. However, little research has explored the potential of the self-attention module for learning global and local geometric information in key-points based motion segmentation. This paper therefore presents a new method, named GIET, that utilizes geometric information in a Transformer network for key-points based motion segmentation. Specifically, two novel local geometric information embedding modules are developed in GIET. Unlike traditional convolution operators, which model the local geometric information of key-points within a fixed-size spatial neighborhood, the Neighbor Embedding Module (NEM) aggregates the feature maps of the k-Nearest Neighbors (k-NN) of each point according to the semantic similarity between the input key-points. NEM not only augments the network's ability to extract local features from the points' neighborhoods, but also characterizes the semantic affinities between points in the same moving object. Furthermore, to investigate the geometric relationships between the points and each motion, a Centroid Embedding Module (CEM) is devised to aggregate the feature maps of cluster centroids that correspond to the moving objects. CEM can effectively capture the semantic similarity between points and the centroids corresponding to the moving objects. Subsequently, the multi-head self-attention mechanism is exploited to learn the global geometric information of all the key-points using the aggregated feature maps obtained from the two embedding modules. Compared to convolution operators or the self-attention mechanism, the proposed simple Transformer-like architecture can optimally utilize both the local and global geometric properties of the input sparse key-points. Finally, the motion segmentation task is formulated as a subspace clustering problem using the Transformer architecture. The experimental results on three motion segmentation datasets, KT3DMoSeg, AdelaideRMF, and FBMS, demonstrate that GIET achieves state-of-the-art performance.
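
The two embedding steps described in the abstract can be illustrated with a small sketch. This is not the GIET implementation: the abstract does not specify the similarity measure, the aggregation rule, or how the centroids are obtained, so the sketch assumes cosine similarity, mean aggregation over the k most similar points, softmax-weighted mixing of centroid features, and centroids supplied by the caller. The function names `knn_aggregate` and `centroid_aggregate` are hypothetical, and the multi-head self-attention stage is omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def knn_aggregate(features, k):
    """NEM-like step (assumed form): append to each point's feature the mean
    feature of its k most similar other points, ranked by cosine similarity."""
    n = len(features)
    out = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Stable sort by descending similarity; ties keep index order.
        others.sort(key=lambda j: -cosine(features[i], features[j]))
        nbrs = others[:k]
        dim = len(features[i])
        if nbrs:
            mean = [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        else:
            mean = [0.0] * dim  # no neighbors available (k == 0 or a single point)
        out.append(list(features[i]) + mean)
    return out


def centroid_aggregate(features, centroids):
    """CEM-like step (assumed form): append to each point's feature a
    softmax-similarity-weighted mixture of the given centroid features."""
    out = []
    dim = len(centroids[0])
    for f in features:
        sims = [cosine(f, c) for c in centroids]
        m = max(sims)
        weights = [math.exp(s - m) for s in sims]  # shift by max for stability
        z = sum(weights)
        mix = [sum(w / z * c[d] for w, c in zip(weights, centroids)) for d in range(dim)]
        out.append(list(f) + mix)
    return out


# Toy data: two pairs of points with similar features, and two assumed centroids.
points = [[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9]]
centers = [[1, 0], [0, 1]]
local = knn_aggregate(points, k=1)
with_centroids = centroid_aggregate(points, centers)
```

In a pipeline of the kind the abstract outlines, the outputs of both steps would then be passed to a self-attention layer; how GIET combines them is not stated in this record.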

Pages: 7856 - 7869

Number of pages: 14

50 records in total

[1] Geometric Learning-Based Transformer Network for Estimation of Segmentation Errors
Sree, Sneha
Al Fahim, Mohammad
Ram, Keerthi
Sivaprakasam, Mohanasankar
SHAPE IN MEDICAL IMAGING, SHAPEMI 2023, 2023, 14350 : 118 - 132
[2] Defect detection of printed circuit board based on adaptive key-points localization network
Yu, Jianbo
Zhao, Lixiang
Wang, Yanshu
Ge, Yifan
COMPUTERS & INDUSTRIAL ENGINEERING, 2024, 193
[3] An Efficient Copy-Move Detection Algorithm Based on Superpixel Segmentation and Harris Key-Points
Liu, Yong
Wang, Hong-Xia
Wu, Han-Zhou
Chen, Yi
CLOUD COMPUTING AND SECURITY, PT I, 2017, 10602
[4] Multi-domain Information Fusion for Key-Points Guided GAN Inversion
Xu, Ruize
Qiu, Xiaowen
He, Boan
Ge, Weifeng
Zhang, Wenqiang
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT XI, 2024, 14435 : 146 - 157
[5] Similarity Comparison of Segmentation Based on Key-points in Real-ESRGAN Super-resolution Satellite SAR Images
Park, Changhan
Journal of Institute of Control, Robotics and Systems, 2024, 30 (08) : 853 - 862
[6] A Key-Points Based Anchor-Free Cervical Cell Detector
Shu, Tong
Shi, Jun
Zheng, Yushan
Jiang, Zhiguo
Yu, Lanlan
2023 45TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE & BIOLOGY SOCIETY, EMBC, 2023,
[7] A Novel Key-Points Based Shapelets Transform for Time Series Classification
Peng, Manman
Luo, Jun
2017 13TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (ICNC-FSKD), 2017, : 2268 - 2273
[8] SSCK-Net: Spine segmentation in MRI based on cross attention and key-points recognition-assisted learner
Li, Haiyan
Wang, Zhixin
Shen, Wei
Li, Huilin
Li, Hongsong
Yu, Pengfei
BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 86
[9] Learning Instance Motion Segmentation With Geometric Embedding
Leng, Zhen
Chen, Jing
Lin, Songnan
IEEE ACCESS, 2021, 9 : 56812 - 56821
[10] Key points based segmentation of lips
Eveno, N
Caplier, A
Coulon, PY
IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL I AND II, PROCEEDINGS, 2002, : A125 - A128
