Learning fusion feature representation for garbage image classification model in human-robot interaction

Cited by: 23

Authors:

Li, Xi ^{[1,2]}

Li, Tian ^{[2]}

Li, Shaoyi ^{[1,2]}

Tian, Bin ^{[1,2]}

Ju, Jianping ^{[1]}

Liu, Tingting ^{[1]}

Liu, Hai ^{[1]}

Affiliations:

[1] Nanchang Inst Sci & Technol, Sch Informat & Artificial Intelligence, Nanchang 330108, Peoples R China

[2] Wuhan Inst Technol, Sch Elect & Informat Engn, Wuhan 430075, Peoples R China

Source:

INFRARED PHYSICS & TECHNOLOGY | 2023 / Vol. 128

Keywords:

Infrared imaging; Image classification; Group convolution; Channel shuffle; CBAM; Label smoothing; NETWORK;

DOI:

10.1016/j.infrared.2022.104457

Chinese Library Classification (CLC) number:

TH7 [Instruments and meters];

Discipline classification codes:

0804 ; 080401 ; 081102 ;

Abstract:

Garbage image classification often faces three challenges: complex image backgrounds, categories with similar shapes, and low-quality images. Existing machine vision methods have excellent learning capabilities but require powerful computational resources. In this work, an efficient garbage image classification network (GScbamKL-Net) is proposed to address these problems. The network is designed around three aspects. First, a new network unit with group convolution and channel shuffle is designed. This unit significantly reduces the number of model parameters while achieving good performance. Second, the CBAM attention mechanism, which extracts key features by weighting the output features along the spatial and channel dimensions, is added to the network unit. Third, LeakyReLU is introduced as the activation function, and a label smoothing function is constructed as the loss function. Together, these mitigate the errors and effects of sample imbalance and provide a good nonlinear transformation. Both ordinary garbage images and garbage images captured with infrared imaging technology were tested. Experimental results show that the proposed GScbamKL-Net achieves excellent classification performance while remaining lightweight.
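The abstract names channel shuffle and label smoothing without giving details. As a rough illustration only, the sketch below shows the generic textbook forms of both in plain Python. It is not the authors' implementation: the function names, the list-based channel representation, and the smoothing factor `eps=0.1` are all assumptions made for this example.

```python
import math


def channel_shuffle(channels, groups):
    """Interleave channels across groups (generic ShuffleNet-style shuffle).

    `channels` is a list with one item per channel (hypothetical representation).
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # View as [groups][per_group], transpose to [per_group][groups], flatten.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


def smoothed_targets(label, num_classes, eps=0.1):
    """Uniform label smoothing: (1 - eps) * one_hot + eps / num_classes."""
    base = eps / num_classes
    return [base + (1.0 - eps if k == label else 0.0)
            for k in range(num_classes)]


def label_smoothing_loss(logits, label, eps=0.1):
    """Cross-entropy between smoothed targets and softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    targets = smoothed_targets(label, len(logits), eps)
    return -sum(t * lp for t, lp in zip(targets, log_probs))


if __name__ == "__main__":
    # Six channels split into two groups: [0, 1, 2] and [3, 4, 5].
    print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))
    print(smoothed_targets(label=1, num_classes=4))
    print(label_smoothing_loss([2.0, 0.5, -1.0], label=0))
```

After the shuffle, each consecutive block of channels contains one channel from every group, so a following group convolution can mix information across groups. With `eps=0` the loss reduces to ordinary cross-entropy.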


Pages: 8
