EGA-Net: Edge Guided Attention Network With Label Refinement for Parsing of Animal Body Parts

Citations: 0
Authors
Raghavendra, S. [1]
Abhilash, S. K. [2]
Nookala, Venu Madhav [2]
Girisha, S. [3]
Adesh, N. D. [1]
Affiliations
[1] Manipal Inst Technol, Manipal Acad Higher Educ, Dept Informat & Commun Technol, Manipal 576104, India
[2] KPIT Technol, Bengaluru 560103, India
[3] Manipal Inst Technol, Manipal Acad Higher Educ, Dept Data Sci & Comp Applicat, Manipal 576104, India
Keywords
Transformers; Accuracy; Animals; Semantics; Feature extraction; Quadrupedal robots; Semantic segmentation; Image edge detection; Training; Decoding; Edge-guided; part segmentation; quadruped animals; semantic segmentation; transformer
DOI
10.1109/ACCESS.2024.3471948
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In computer vision, semantic segmentation delineates objects at the pixel level. The approach continues to evolve as new modules and adjustments are added to suit the characteristics of different object classes. Pixel-level semantic segmentation is intricate and computationally intensive, especially for part-based approaches. This study proposes an edge-guided, transformer-based attention network for precisely segmenting the body parts of quadruped animals. Labeling masks at the pixel level is difficult for many object categories and often results in inaccurate annotations. To address this, an additional mechanism iteratively refines the labels to improve pixel-level accuracy between classes. The model is evaluated on the PascalPart and PartImageNet datasets using transformer architectures of various scales. Performance is measured with mean Intersection-over-Union (mIoU), Pixel Accuracy (PA), and mean Accuracy (mA). Ablation studies evaluate the model's performance with respect to network parameters, and the contribution of each component is assessed using Class Activation Maps (CAM). The results show an 8% improvement in mIoU over existing state-of-the-art architectures, indicating that the proposed model is effective for fine-grained part segmentation, particularly of quadruped animals.
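The three metrics named in the abstract can be illustrated with a minimal sketch based on their standard confusion-matrix definitions. The record does not describe the paper's evaluation protocol (for example, whether background is ignored or how scores are averaged across images), so the function name `segmentation_metrics` and its handling of absent classes are assumptions made for illustration only.

```python
def segmentation_metrics(pred, target, num_classes):
    """Return (pixel_accuracy, mean_accuracy, mean_iou).

    `pred` and `target` are flat sequences of integer class labels,
    one per pixel. Hypothetical helper, not the paper's code.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if len(target) == 0:
        raise ValueError("at least one pixel is required")

    # confusion[t][p] = number of pixels with ground truth t predicted as p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        confusion[t][p] += 1

    # PA: fraction of all pixels whose predicted label is correct
    correct = sum(confusion[c][c] for c in range(num_classes))
    pixel_accuracy = correct / len(target)

    class_accuracies = []
    class_ious = []
    for c in range(num_classes):
        true_pos = confusion[c][c]
        gt_count = sum(confusion[c])  # pixels labeled c in the ground truth
        pred_count = sum(confusion[r][c] for r in range(num_classes))
        union = gt_count + pred_count - true_pos
        # Assumption: classes absent from the data are skipped, not scored 0
        if gt_count > 0:
            class_accuracies.append(true_pos / gt_count)
        if union > 0:
            class_ious.append(true_pos / union)

    # mA and mIoU: unweighted means over the classes that were scored
    mean_accuracy = sum(class_accuracies) / len(class_accuracies)
    mean_iou = sum(class_ious) / len(class_ious)
    return pixel_accuracy, mean_accuracy, mean_iou


# Toy example: six pixels, three part classes
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
pa, ma, miou = segmentation_metrics(pred, target, num_classes=3)
print(f"PA={pa:.3f} mA={ma:.3f} mIoU={miou:.3f}")
```

In the toy example, four of six pixels are correct, while the per-class IoU values are 1/3, 2/3, and 1/2, so PA and mIoU differ even on the same prediction. This is why part-segmentation work typically reports all three.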
Pages: 149162-149172
Page count: 11