SMS3D: 3D Synthetic Mushroom Scenes Dataset for 3D Object Detection and Pose Estimation

Cited: 0
Authors
Zakeri, Abdollah [1 ]
Koirala, Bikram [2 ]
Kang, Jiming [2 ]
Balan, Venkatesh [2 ]
Zhu, Weihang [2 ]
Benhaddou, Driss [2 ]
Merchant, Fatima A. [1 ,2 ]
Affiliations
[1] Univ Houston, Dept Comp Sci, Houston, TX 77004 USA
[2] Univ Houston, Dept Engn Technol, Houston, TX 77004 USA
Funding
U.S. Department of Agriculture;
Keywords
synthetic dataset; mushroom detection; mushroom pose estimation; point cloud; deep learning; computer vision; mushroom harvesting automation; object detection; instance segmentation;
DOI
10.3390/computers14040128
Chinese Library Classification (CLC)
TP39 [Computer Applications];
Discipline Classification Codes
081203; 0835;
Abstract
The mushroom farming industry struggles to automate harvesting due to limited large-scale annotated datasets and the complex growth patterns of mushrooms, which complicate detection, segmentation, and pose estimation. To address this, we introduce a synthetic dataset with 40,000 unique scenes of white Agaricus bisporus and brown baby bella mushrooms, capturing realistic variations in quantity, position, orientation, and growth stage. Our two-stage pose estimation pipeline combines 2D object detection and instance segmentation with a 3D point-cloud-based pose estimation network built on a Point Transformer. By employing a continuous 6D rotation representation and a geodesic loss, our method ensures precise rotation predictions. Experiments show that processing point clouds with 1024 points and the 6D Gram-Schmidt rotation representation yields optimal results, achieving an average rotational error of 1.67 degrees on synthetic data and surpassing current state-of-the-art methods in mushroom pose estimation. Furthermore, the model generalizes well to real-world data, attaining a mean angle difference of 3.68 degrees on a subset of the M18K dataset with ground-truth annotations. This approach aims to drive automation in harvesting, growth monitoring, and quality assessment in the mushroom industry.
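The 6D Gram-Schmidt rotation representation and the geodesic (angular) error mentioned in the abstract can be sketched as follows. This is a minimal NumPy illustration of the general technique (Zhou et al., CVPR 2019), not the authors' code; the function names are hypothetical.

```python
import numpy as np

def rotation_from_6d(sixd):
    """Map a 6D vector (two 3D columns) to a rotation matrix via Gram-Schmidt.

    This continuous representation avoids the discontinuities of Euler angles
    and quaternions, which is why it is favored for regression networks.
    """
    a1, a2 = sixd[:3], sixd[3:]
    b1 = a1 / np.linalg.norm(a1)        # normalize the first column
    a2 = a2 - np.dot(b1, a2) * b1       # remove the component along b1
    b2 = a2 / np.linalg.norm(a2)        # normalize the orthogonalized second column
    b3 = np.cross(b1, b2)               # third column completes a right-handed frame
    return np.stack([b1, b2, b3], axis=1)

def geodesic_angle(R1, R2):
    """Geodesic distance in radians between two rotation matrices."""
    cos = (np.trace(R1.T @ R2) - 1.0) / 2.0
    return np.arccos(np.clip(cos, -1.0, 1.0))
```

A geodesic loss simply penalizes `geodesic_angle` between the predicted and ground-truth rotations; the reported rotational errors (e.g., 1.67 degrees) are this quantity averaged over the test set and converted to degrees.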
Pages: 19