Few-Shot Semantic Segmentation for Consumer Electronics: An Inter-Class Relation Mining Approach

Cited by: 4
Authors
Huang, Huafei [1]
Yuan, Xu [1]
Yu, Shuo [2, 3]
Zhao, Wenhong [4]
Alfarraj, Osama [5]
Tolba, Amr [5]
Xia, Feng [6]
Affiliations
[1] Dalian Univ Technol, Sch Software, Dalian 116620, Peoples R China
[2] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian 116024, Peoples R China
[3] Dalian Univ Technol, Key Lab Social Comp & Cognit Intelligence, Minist Educ, Dalian 116024, Peoples R China
[4] Zhejiang Univ Technol, Ultraprecis Machining Ctr, Hangzhou 310014, Peoples R China
[5] King Saud Univ, Community Coll, Comp Sci Dept, Riyadh 11437, Saudi Arabia
[6] RMIT Univ, Sch Comp Technol, Melbourne, Vic 3000, Australia
Keywords
Semantic segmentation; Interference; Feature extraction; Semantics; Consumer electronics; Computational modeling; Predictive models; image processing; diffusion model; few-shot learning; relation mining; NETWORK;
DOI
10.1109/TCE.2024.3373630
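Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract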
Few-shot semantic segmentation (FSS), which segments images using only a limited number of annotated examples, is a promising technique that has been embedded in many electronic products. Existing approaches usually segment the query image by computing the similarity between the support and query images. However, when segmenting a new query image, the model's prediction may be disturbed by distinct classes that carry similar semantic information, leading to unsatisfactory results. This may greatly weaken the generalization of FSS in real-world scenarios. In response to this challenge, we propose IRMNet, a few-shot semantic segmentation model based on inter-class relation mining. First, we devise a class filter module that accurately selects useful semantic information by mining the class relations between the query and support images. Then, we use a class generation module that applies a diffusion model to generate rough segmentation masks for query images, augmenting the supervision signals. Finally, we conduct extensive experiments on the PASCAL-5^i and FSS-1000 datasets. The evaluation results show that IRMNet outperforms the baselines it is compared against. The advancement of FSS in this work can contribute to enhancing visual intelligence in real-world consumer electronics.
Pages: 3709-3721
Page count: 13