GL-YOLO-Lite: A Novel Lightweight Fallen Person Detection Model

Cited by: 10

Authors:

Dai, Yuan [1]

Liu, Weiming [1]

Affiliations:

[1] South China Univ Technol, Sch Civil Engn & Transportat, Guangzhou 510641, Peoples R China

Source:

ENTROPY | 2023 / Volume 25 / Issue 04

Keywords:

fallen person detection; deep learning; computer vision; object detection; lightweight neural networks; binary cross-entropy

DOI:

10.3390/e25040587

Chinese Library Classification (CLC) number:

O4 [Physics]

Discipline classification code:

0702

Abstract:

Fallen person detection (FPD) is a crucial task in guaranteeing individual safety. Although deep-learning models have shown potential in addressing this challenge, they face several obstacles, such as inadequate use of global contextual information, poor feature extraction, and substantial computational requirements. These limitations lead to low detection accuracy, poor generalization, and slow inference speeds. To overcome these challenges, the present study proposed a new lightweight detection model named Global and Local You-Only-Look-Once Lite (GL-YOLO-Lite), which integrates both global and local contextual information by incorporating transformer and attention modules into the popular object-detection framework YOLOv5. Specifically, a stem module replaced the original inefficient focus module, and rep modules with re-parameterization technology were introduced. Furthermore, a lightweight detection head was developed to reduce the number of redundant channels in the model. Finally, we constructed a large-scale, well-formatted FPD dataset (FPDD). The proposed model employed a binary cross-entropy (BCE) function to calculate the classification and confidence losses. An experimental evaluation on the FPDD and Pascal VOC datasets demonstrated that GL-YOLO-Lite outperformed other state-of-the-art models by significant margins of 2.4-18.9 in mean average precision (mAP) on FPDD and 1.8-23.3 on the Pascal VOC dataset. Moreover, GL-YOLO-Lite maintained a real-time processing speed of 56.82 frames per second (FPS) on a Titan Xp and 16.45 FPS on a HiSilicon Kirin 980, demonstrating its effectiveness in real-world scenarios.
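The abstract states that a binary cross-entropy (BCE) function is used for the classification and confidence losses. The following is a minimal sketch of that idea in plain Python, not the authors' implementation: the function names `bce` and `detection_loss`, the clamping constant `eps`, and the two weights are assumptions introduced here for illustration, and the box-regression term of a full detector loss is omitted because the abstract does not describe it.

```python
import math


def bce(probabilities, targets, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    if len(probabilities) != len(targets):
        raise ValueError("probabilities and targets must have the same length")
    if not probabilities:
        raise ValueError("at least one prediction is required")
    total = 0.0
    for p, t in zip(probabilities, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(probabilities)


def detection_loss(class_probs, class_targets, conf_probs, conf_targets,
                   class_weight=1.0, conf_weight=1.0):
    """Weighted sum of the classification and confidence BCE terms.

    The weights are hypothetical; the abstract does not give their values.
    """
    return (class_weight * bce(class_probs, class_targets)
            + conf_weight * bce(conf_probs, conf_targets))
```

For example, `bce([0.9, 0.1], [1, 0])` penalizes two fairly confident, correct predictions only lightly, whereas `bce([0.1], [1])` yields a much larger loss for a confident mistake.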


Pages: 20

References: 63 in total

