SMS3D: 3D Synthetic Mushroom Scenes Dataset for 3D Object Detection and Pose Estimation

Cited: 0
Authors
Zakeri, Abdollah [1 ]
Koirala, Bikram [2 ]
Kang, Jiming [2 ]
Balan, Venkatesh [2 ]
Zhu, Weihang [2 ]
Benhaddou, Driss [2 ]
Merchant, Fatima A. [1 ,2 ]
Affiliations
[1] Univ Houston, Dept Comp Sci, Houston, TX 77004 USA
[2] Univ Houston, Dept Engn Technol, Houston, TX 77004 USA
Funding
U.S. Department of Agriculture;
Keywords
synthetic dataset; mushroom detection; mushroom pose estimation; point cloud; deep learning; computer vision; mushroom harvesting automation; object detection; instance segmentation;
DOI
10.3390/computers14040128
Chinese Library Classification (CLC)
TP39 [Computer Applications];
Discipline Classification Codes
081203; 0835;
Abstract
The mushroom farming industry struggles to automate harvesting due to limited large-scale annotated datasets and the complex growth patterns of mushrooms, which complicate detection, segmentation, and pose estimation. To address this, we introduce a synthetic dataset with 40,000 unique scenes of white Agaricus bisporus and brown baby bella mushrooms, capturing realistic variations in quantity, position, orientation, and growth stage. Our two-stage pose estimation pipeline combines 2D object detection and instance segmentation with a 3D point-cloud-based pose estimation network built on a Point Transformer. By employing a continuous 6D rotation representation and a geodesic loss, our method ensures precise rotation predictions. Experiments show that processing point clouds with 1024 points and the 6D Gram-Schmidt rotation representation yields optimal results, achieving an average rotational error of 1.67 degrees on synthetic data and surpassing current state-of-the-art methods in mushroom pose estimation. Furthermore, the model generalizes well to real-world data, attaining a mean angle difference of 3.68 degrees on a subset of the M18K dataset with ground-truth annotations. This approach aims to drive automation in harvesting, growth monitoring, and quality assessment in the mushroom industry.
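The 6D Gram-Schmidt rotation representation and the geodesic (angular) error mentioned in the abstract can be sketched as follows. This is a minimal NumPy illustration of the general technique (Zhou et al., CVPR 2019), not the authors' code; the function names are hypothetical.

```python
import numpy as np

def rotation_from_6d(sixd):
    """Map a 6D vector (two 3D columns) to a rotation matrix via Gram-Schmidt.

    This continuous representation avoids the discontinuities of Euler angles
    and quaternions, which is why it is favored for regression networks.
    """
    a1, a2 = sixd[:3], sixd[3:]
    b1 = a1 / np.linalg.norm(a1)        # normalize the first column
    a2 = a2 - np.dot(b1, a2) * b1       # remove the component along b1
    b2 = a2 / np.linalg.norm(a2)        # normalize the orthogonalized second column
    b3 = np.cross(b1, b2)               # third column completes a right-handed frame
    return np.stack([b1, b2, b3], axis=1)

def geodesic_angle(R1, R2):
    """Geodesic distance in radians between two rotation matrices."""
    cos = (np.trace(R1.T @ R2) - 1.0) / 2.0
    return np.arccos(np.clip(cos, -1.0, 1.0))
```

A geodesic loss simply penalizes `geodesic_angle` between the predicted and ground-truth rotations; the reported rotational errors (e.g., 1.67 degrees) are this quantity averaged over the test set and converted to degrees.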
Pages: 19