A semantic segmentation algorithm for fashion images based on modified mask RCNN

Cited by: 3

Authors:

He, Wentao [1]

Wang, Jing'an [1]

Wang, Lei [1]

Pan, Ruru [1]

Gao, Weidong [1]

Affiliations:

[1] Jiangnan Univ, Key Lab Ecotext, Minist Educ, 1800 Lihu Ave, Wuxi 214122, Jiangsu, Peoples R China

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2023 / Vol. 82 / Issue 18

Funding:

National Natural Science Foundation of China;

Keywords:

Clothes; Semantic segmentation; Mask RCNN; Feature pyramid network; Detection accuracy;

DOI:

10.1007/s11042-023-14958-1

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Semantic segmentation of human body images has considerable application potential in fields such as autonomous driving, artificial intelligence (AI) face changing, and virtual try-on. Many researchers currently use additional human body posture information to generate multi-level human body analysis images. However, existing methods have limitations when faced with multiple poses and overlapping targets. This paper proposes a novel algorithm based on Mask RCNN, which has pixel-level accuracy. In the feature extraction process, a multi-scale feature fusion module applying dilated convolution is proposed to obtain richer semantic information from different receptive fields. A small residual module is added to the original residual unit structure to enlarge the receptive field of each layer and capture both details and global characteristics. Three convolution kernels with different ratios are designed to obtain receptive fields of different scales. The experimental results show that the method performs better when both object positioning and target classification are considered.
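The multi-scale idea in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not give the dilation values or the fusion rule, so the rates `(1, 2, 3)`, the averaging fusion, and the function names `effective_kernel_size`, `dilated_conv2d`, and `multi_scale_fusion` are all assumptions made for illustration. The sketch also reads the "different ratios" of the three kernels as dilation rates, which is an interpretation. It shows how a 3x3 kernel covers a wider span as the dilation grows, following k_eff = k + (k - 1)(d - 1).

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv2d(image, kernel, dilation):
    """Zero-padded, same-size 2D dilated convolution (cross-correlation form)."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    half = (k // 2) * dilation  # offset that centers the dilated kernel
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy = y + i * dilation - half
                    xx = x + j * dilation - half
                    # Samples outside the image count as zero padding.
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def multi_scale_fusion(image, kernel, dilations=(1, 2, 3)):
    """Apply one kernel at several dilation rates and average the branches.

    Averaging is an assumed fusion rule; the rates are hypothetical.
    """
    branches = [dilated_conv2d(image, kernel, d) for d in dilations]
    h, w = len(image), len(image[0])
    return [
        [sum(b[y][x] for b in branches) / len(branches) for x in range(w)]
        for y in range(h)
    ]


if __name__ == "__main__":
    for d in (1, 2, 3):
        print("dilation", d, "covers", effective_kernel_size(3, d), "pixels per side")
```

With a 3x3 kernel, the three assumed rates cover spans of 3, 5, and 7 pixels per side while keeping the same nine weights, which is why dilated branches can widen the receptive field without adding parameters.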


Pages: 28427-28444

Page count: 18
