Using synthetic dataset for semantic segmentation of the human body in the problem of extracting anthropometric data

Cited by: 0
Authors
Absadyk, Azat [1]
Turar, Olzhas [1]
Akhmed-Zaki, Darkhan [1]
Affiliations
[1] Astana IT Univ, Dept Sci & Innovat, Astana, Kazakhstan
Source
FRONTIERS IN ARTIFICIAL INTELLIGENCE | 2024 / Vol. 7
Keywords
synthetic data; human segmentation; anthropometry; CNN; NVIDIA replicator; human body;
DOI
10.3389/frai.2024.1336320
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Background: The COVID-19 pandemic highlighted the need for accurate virtual sizing in e-commerce to reduce returns and waste. Existing methods for extracting anthropometric data from images have limitations. This study aims to develop a semantic segmentation model, trained on synthetic data, that can accurately determine body shape from real images while accounting for clothing.

Methods: A synthetic dataset of over 22,000 images was created using NVIDIA Omniverse Replicator, featuring human models in various poses, clothing, and environments. Popular CNN architectures (U-Net, SegNet, DeepLabV3, PSPNet) with different backbones were trained on this dataset for semantic segmentation. Models were evaluated on accuracy, precision, recall, and IoU. The best-performing model was tested on images of real human subjects, and the extracted measurements were compared with the subjects' actual measurements.

Results: U-Net with an EfficientNet backbone performed best, with 99.83% training accuracy and an IoU score of 0.977. When tested on real images, it accurately segmented body shape while accounting for clothing. Comparison with actual measurements on 9 subjects showed average deviations of -0.24 cm for neck, -0.1 cm for shoulder, 1.15 cm for chest, -0.22 cm for waist, and 0.17 cm for hip measurements.

Discussion: The synthetic dataset and trained models enable accurate extraction of anthropometric data from real images while accounting for clothing. This approach has significant potential for improving virtual fitting and reducing returns in e-commerce. Future work will focus on refining the algorithm, particularly for the waist and hip measurements, which showed higher variability.
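
The abstract names accuracy, precision, recall, and IoU as the evaluation metrics but does not say how they were computed. As a minimal sketch, assuming a binary body/background mask and the usual pixel-wise definitions, they could be calculated as below. The function name `segmentation_metrics` and the toy masks are hypothetical illustrations, not taken from the paper.

```python
def segmentation_metrics(pred, truth):
    """Pixel-wise metrics for two binary masks.

    Both masks are equally sized 2D lists of 0/1 values, where 1 marks a
    "body" pixel (assumed convention, not specified in the abstract).
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # body pixel correctly predicted
            elif p:
                fp += 1      # background predicted as body
            elif t:
                fn += 1      # body pixel missed
            else:
                tn += 1      # background correctly predicted

    def ratio(numerator, denominator):
        # Return 0.0 for an empty denominator instead of raising.
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "iou": ratio(tp, tp + fp + fn),  # intersection over union
    }


# Toy 3x4 masks: the prediction misses one body pixel.
pred = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0]]
truth = [[1, 1, 1, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 0]]

metrics = segmentation_metrics(pred, truth)
print(metrics)
```

In this toy case the intersection is 4 pixels and the union is 5, so IoU is 0.8 even though accuracy is 11/12. This shows why IoU is usually reported alongside accuracy when background pixels dominate an image.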
Pages: 15