Lightweight multi-attribute target detection for dogs and cats based on improved YOLOv7

Times cited: 0

Authors:

Cao, Danyang [1]

Liu, Fangfang [1]

Affiliations:

[1] North China Univ Technol, Sch Informat & Technol, Beijing 100144, Peoples R China

Source:

PROCEEDINGS OF 2024 3RD INTERNATIONAL CONFERENCE ON CYBER SECURITY, ARTIFICIAL INTELLIGENCE AND DIGITAL ECONOMY, CSAIDE 2024 | 2024

Keywords:

Fine-grained image classification; Target detection; Dog and cat multi-attribute recognition; YOLOv7; PConv; B-CNN;

DOI:

10.1145/3672919.3672992

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Multi-attribute target detection can accurately obtain both the location and the attributes of cats and dogs, providing a more effective means of pet management. However, it is difficult to locate dogs and cats quickly in complex environments and to identify their fine-grained attributes accurately. To address these issues, this research introduces a streamlined and efficient method for multi-attribute target detection. First, a lightweight YOLOv7 network (YOLOv7-P) quickly determines the location of cats and dogs, which excludes the influence of background and other redundant information on classification. Then, a B-CNN (Bilinear Convolutional Neural Network) identifies the fine-grained attributes of the located cats and dogs, improving attribute classification accuracy. YOLOv7-P simplifies the network structure by using the PConv (partial convolution) module in place of the traditional 3x3 convolution block in the YOLOv7 ELAN structure. It also incorporates a hierarchical adaptive channel pruning technique that identifies and removes unimportant filters within the most redundant layers, minimizing redundant computation and memory access and thereby increasing detection speed. Compared with the original YOLOv7 model, YOLOv7-P has 17.02% fewer parameters, 26.44% fewer GFLOPS, and 9.50% higher FPS. The target regions localized by YOLOv7-P were fed into the B-CNN for training and classification, yielding classification accuracies of 98.92% for species (cat or dog) and 92.08% for breed on the Oxford-IIIT Pet Dataset.
In summary, the proposed multi-attribute target detection method detects cats and dogs and their multiple attributes quickly and accurately in complex environments. The approach has considerable practical value and broad application prospects.
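
The computational saving that the abstract attributes to PConv can be illustrated with a rough operation count. The sketch below assumes the formulation in reference [1], where a k x k convolution is applied to only a fraction of the channels and the remaining channels pass through unchanged. The feature-map size, channel count, 1/4 partial ratio, and function names are illustrative assumptions, not values or code from this paper.

```python
def conv_macs(h, w, k, c_in, c_out):
    """Multiply-accumulate count of a dense k x k convolution
    (stride 1, same padding, no bias) on an h x w feature map."""
    return h * w * k * k * c_in * c_out


def pconv_macs(h, w, k, c, partial_ratio=0.25):
    """Multiply-accumulate count of a partial convolution: only
    c_p = c * partial_ratio channels are convolved, the rest are
    copied through untouched and cost no arithmetic."""
    c_p = int(c * partial_ratio)
    return conv_macs(h, w, k, c_p, c_p)


# Hypothetical layer: 40 x 40 feature map, 128 channels, 3 x 3 kernel.
h, w, k, c = 40, 40, 3, 128
dense = conv_macs(h, w, k, c, c)
partial = pconv_macs(h, w, k, c)
ratio = partial / dense

print(f"dense 3x3 conv : {dense:,} MACs")
print(f"partial conv   : {partial:,} MACs")
print(f"ratio          : {ratio:.4f}")
```

With a partial ratio of 1/4, the convolved layer needs 1/16 of the dense arithmetic. The whole-network reduction reported in the abstract is smaller than this per-layer figure, presumably because only the 3x3 blocks in the ELAN structure are replaced.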

Citation

Pages: 395-400

Page count: 6

11 references in total

[1] Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks
Chen, Jierun
Kao, Shiu-Hong
He, Hao
Zhuo, Weipeng
Wen, Song
Lee, Chul-Ho
Chan, S. -H. Gary
[J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 12021 - 12031
[2] He KM, 2017, IEEE I CONF COMP VIS, P2980, DOI [10.1109/TPAMI.2018.2844175, 10.1109/ICCV.2017.322]
[3] Li S., 2022, 2022 IEEE INT C DEP, P1, DOI 10.1109/DASC/PiCom/CBDCom/Cy55231.2022.9927801
[4] Bilinear Convolutional Neural Networks for Fine-Grained Visual Recognition
Lin, Tsung-Yu
RoyChowdhury, Aruni
Maji, Subhransu
[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (06) : 1309 - 1322
[5] Shen Q., 2023, 2023 IEEE INT C SENS, P595, DOI 10.1109/ICSECE58870.2023.10263533
[6] COFENET: CO-FEATURE NEURAL NETWORK MODEL FOR FINE-GRAINED IMAGE CLASSIFICATION
Wang, Bor-Shiun
Hsieh, Jun-Wei
Hsieh, Yi-Kuan
Chen, Ping-Yang
[J]. 2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 3928 - 3932
[7] YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Wang, Chien-Yao
Bochkovskiy, Alexey
Liao, Hong-Yuan Mark
[J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 7464 - 7475
[8] Convolutional Neural Network Pruning with Structural Redundancy Reduction
Wang, Zi
Li, Chengcheng
Wang, Xiangyang
[J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 14908 - 14917
[9] Xu Xuetian, 2023, 2022 International Conference on Image Processing and Computer Vision (IPCV), P56, DOI 10.1109/IPCV57033.2023.00017
[10] Zheng Huang, 2021, 2021 11th International Conference on Information Technology in Medicine and Education (ITME), P191, DOI 10.1109/ITME53901.2021.00047
