Multi-Level Segmentation Data Generation Based on a Scene-Specific Word Tree

Times Cited: 0
Authors
Kim, Soomin [1 ]
Park, Juyoun [1 ]
Affiliations
[1] Korea Inst Sci & Technol, Seoul 02792, South Korea
Source
IEEE ACCESS | 2024, Vol. 12
Keywords
Image segmentation; Semantics; Visualization; Semantic segmentation; Training; Image recognition; Data models; Segmentation; semantic grouping; language hierarchy; dataset generation; multi-level analysis
DOI
10.1109/ACCESS.2024.3418515
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
We humans perceive scenes using pre-learned language categories. Our vocabulary system inherently possesses a hierarchy, which helps us understand scenes at multiple levels. For example, when a person passes by chairs and desks from a distance rather than interacting with them up close, the objects are perceived from a broader perspective and recognized at a higher category level, as furniture. In this work, we propose a multi-level semantic segmentation data generation method based on a scene-specific word tree that mimics human multi-level scene recognition. Multi-level semantic segmentation data encompass diverse levels of grouped segmented regions with varying degrees of detail, from the finest level of conventional semantic segmentation to coarser levels. Our scene-specific word trees leverage linguistic hierarchies to group scene components by considering the relationships between the words present in the scene. Furthermore, in the proposed data generation method, each word tree is constructed within a single image, allowing objects to be grouped into user-selected levels while taking into account the relative relationships between objects in that scene. We demonstrate the effectiveness of our data generation method by building a multi-level scene segmentation network and training it on the generated dataset, which reflects the scene-specific word tree.
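The core idea in the abstract, grouping fine-grained segmentation labels into coarser categories via a linguistic hierarchy, can be sketched as follows. This is a minimal illustration only, not the paper's implementation: the hypernym chains here are hardcoded toy examples (the paper derives its word trees per image from an actual linguistic hierarchy), and the `group_labels` and `relabel_mask` helpers are hypothetical names.

```python
# Toy hypernym chains (root -> ... -> leaf); an assumption for illustration.
# In the paper, such chains would come from a linguistic hierarchy and be
# assembled into a word tree specific to each scene.
HYPERNYM_CHAINS = {
    "chair":  ["entity", "artifact", "furniture", "seat", "chair"],
    "desk":   ["entity", "artifact", "furniture", "table", "desk"],
    "sofa":   ["entity", "artifact", "furniture", "seat", "sofa"],
    "person": ["entity", "organism", "person"],
}

def group_labels(labels, level):
    """Map each fine label to its ancestor at depth `level`.

    The depth is clamped to the chain length, so a label shallower than
    the requested level simply stays at its finest category.
    """
    return {
        lab: HYPERNYM_CHAINS[lab][min(level, len(HYPERNYM_CHAINS[lab]) - 1)]
        for lab in labels
    }

def relabel_mask(mask, mapping):
    """Rewrite a per-pixel label mask (rows of label strings) at the coarser level."""
    return [[mapping[px] for px in row] for row in mask]

# At level 2, chairs, desks, and sofas all merge into "furniture",
# mimicking the distant-view example from the abstract.
mapping = group_labels(["chair", "desk", "sofa"], level=2)
coarse = relabel_mask([["chair", "desk"], ["desk", "sofa"]], mapping)
```

Here the user-selected level plays the role of the viewing granularity: a deeper level keeps the conventional fine-grained labels, while a shallower one yields the coarser grouped segmentation the paper generates as training data.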
Pages: 88202-88215 (14 pages)