Mix3D: Out-of-Context Data Augmentation for 3D Scenes

Cited by: 77

Authors:

Nekrasov, Alexey [1]

Schult, Jonas [1]

Litany, Or [2]

Leibe, Bastian [1]

Engelmann, Francis [1, 3]

Affiliations:

[1] RWTH Aachen University, Aachen, Germany

[2] NVIDIA, Sunnyvale, CA, USA

[3] ETH AI Center, Zurich, Switzerland

Source:

2021 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2021) | 2021

Keywords:

NETWORKS;

DOI:

10.1109/3DV53792.2021.00022

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We present Mix3D, a data augmentation technique for segmenting large-scale 3D scenes. Since scene context helps reasoning about object semantics, current works focus on models with large capacity and receptive fields that can fully capture the global context of an input 3D scene. However, strong contextual priors can have detrimental implications like mistaking a pedestrian crossing the street for a car. In this work, we focus on the importance of balancing global scene context and local geometry, with the goal of generalizing beyond the contextual priors in the training set. In particular, we propose a "mixing" technique which creates new training samples by combining two augmented scenes. By doing so, object instances are implicitly placed into novel out-of-context environments, therefore making it harder for models to rely on scene context alone, and instead infer semantics from local structure as well. We perform detailed analysis to understand the importance of global context, local structures and the effect of mixing scenes. In experiments, we show that models trained with Mix3D profit from a significant performance boost on indoor (ScanNet, S3DIS) and outdoor datasets (SemanticKITTI). Mix3D can be trivially used with any existing method, e.g., trained with Mix3D, MinkowskiNet outperforms all prior state-of-the-art methods by a significant margin on the ScanNet test benchmark (78.1% mIoU). Code is available at: https://nekrasov.dev/mix3d/
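The "mixing" step described in the abstract, combining two augmented scenes into one training sample, can be illustrated with a minimal sketch. This is not the authors' implementation (their code is at the URL above). It assumes each scene is a NumPy array of xyz points with one semantic label per point. The function names (`center`, `random_z_rotation`, `mix_scenes`) are hypothetical, the random rotation about the vertical axis is only a stand-in for whatever augmentation pipeline is actually used, and centering each scene at the origin is an assumption made here so that the two scenes overlap.

```python
import numpy as np


def center(points):
    """Shift a scene so that its centroid sits at the origin (assumption of this sketch)."""
    return points - points.mean(axis=0, keepdims=True)


def random_z_rotation(points, rng):
    """Stand-in augmentation: rotate the scene by a random angle about the z-axis."""
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0],
                    [s, c, 0.0],
                    [0.0, 0.0, 1.0]])
    return points @ rot.T


def mix_scenes(points_a, labels_a, points_b, labels_b, rng):
    """Combine two independently augmented scenes into one training sample.

    The mixed sample is the union of both point sets; each point keeps
    its original label, so objects from one scene appear inside the
    context of the other.
    """
    aug_a = random_z_rotation(center(points_a), rng)
    aug_b = random_z_rotation(center(points_b), rng)
    mixed_points = np.concatenate([aug_a, aug_b], axis=0)
    mixed_labels = np.concatenate([labels_a, labels_b], axis=0)
    return mixed_points, mixed_labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two toy "scenes" with different offsets and label ids.
    scene_a = rng.normal(size=(5, 3)) + 10.0
    scene_b = rng.normal(size=(7, 3)) - 4.0
    labels_a = np.zeros(5, dtype=np.int64)
    labels_b = np.ones(7, dtype=np.int64)
    points, labels = mix_scenes(scene_a, labels_a, scene_b, labels_b, rng)
    print(points.shape, labels.shape)
```

Because the labels are concatenated unchanged, a segmentation network trained on the mixed sample still receives per-point supervision, but the global context around each object no longer matches a single real scene.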


Pages: 116-125

Page count: 10
