An Improved SSD-Like Deep Network-Based Object Detection Method for Indoor Scenes

Cited by: 38
Authors
Ni, Jianjun [1]
Shen, Kang [1]
Chen, Yan [1]
Yang, Simon X. [2]
Affiliations
[1] Hohai Univ, Coll Internet Things Engn, Changzhou 213022, Jiangsu, Peoples R China
[2] Univ Guelph, Sch Engn, Adv Robot & Intelligent Syst ARIS Lab, Guelph, ON N1G 2W1, Canada
Funding
National Natural Science Foundation of China
Keywords
Object detection; Feature extraction; Robots; Deep learning; Task analysis; Lighting; Data mining; Deep network; indoor scene; object detection; ResNet50; network; single-shot multibox detector (SSD) algorithm; RECOGNITION; NAVIGATION;
DOI
10.1109/TIM.2023.3244819
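Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract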
Indoor scene object detection is an important research topic in scene understanding for indoor robots. In recent years, deep learning-based solutions have achieved good results in object detection. However, indoor object detection still faces open problems, such as the lighting and occlusion problems caused by the complexity of indoor environments. To address these problems, this article proposes an improved object detection method based on deep neural networks that uses a framework similar to the single-shot multibox detector (SSD). In the proposed method, an improved ResNet50 network enhances the transmission of information and improves the feature expression capability of the feature extraction network. A multiscale contextual information extraction (MCIE) module extracts the contextual information of the indoor scene to improve indoor object detection performance. In addition, an improved dual-threshold non-maximum suppression (DT-NMS) algorithm is used to alleviate the occlusion problem in indoor scenes. Finally, the public dataset SUN2012 is further screened for indoor scene object detection, and the proposed method is tested on the screened dataset. The experimental results show that the mean average precision (mAP) of the proposed method can reach 54.10%, which is higher than that of the state-of-the-art methods.
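For readers unfamiliar with non-maximum suppression, the following is a minimal sketch of the standard greedy, single-threshold NMS that variants such as DT-NMS build on. It is not the authors' DT-NMS: the abstract does not specify how the two thresholds are used, so none of that logic is reproduced here. The function names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are assumptions made for illustration only.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2), with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Standard single-threshold NMS (baseline only, not the paper's DT-NMS).

    Returns the indices of the kept boxes, in descending score order.
    """
    # Visit candidates from highest to lowest confidence.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Discard remaining boxes that overlap the kept box too strongly.
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    print(greedy_nms(demo_boxes, demo_scores))
```

With a single threshold, a heavily occluded object whose box overlaps a higher-scoring neighbor is discarded outright, which is the kind of failure in cluttered indoor scenes that the abstract says DT-NMS is meant to alleviate.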
Pages: 15