Improving Performance of Real-Time Object Detection in Edge Device Through Concurrent Multi-Frame Processing

Cited by: 1
Authors
Kim, Seunghwan [1 ]
Kim, Changjong [1 ]
Kim, Sunggon [1 ]
Affiliations
[1] Seoul Natl Univ Sci & Technol, Dept Comp Sci & Engn, Seoul 01811, South Korea
Keywords
YOLO; Image edge detection; Performance evaluation; Computational modeling; Streaming media; Real-time systems; Load modeling; Runtime; Parallel processing; Memory management; Edge devices; machine learning; object detection; performance optimization
DOI
10.1109/ACCESS.2024.3520240
Chinese Library Classification (CLC)
TP [Automation technology; computer technology]
Discipline classification code
0812
Abstract
As the performance and accuracy of machine learning and AI algorithms improve, demand is growing for computer vision techniques that address problems such as autonomous driving and AI robotics. To meet this demand, IoT and edge devices, which are small enough to be deployed in diverse environments while offering sufficient computing capability, are being widely adopted. However, compared to traditional server environments, IoT and edge environments impose harsh restrictions: devices are often limited by low computational and memory resources as well as a constrained electrical power supply. This necessitates a unique approach for small IoT devices that must run complex tasks. In this paper, we propose a concurrent multi-frame processing scheme for real-time object detection algorithms. We first divide the video into individual frames and group the frames according to the number of cores in the device. We then allocate one group of frames per core to perform object detection, enabling parallel detection of multiple frames. We implement our scheme in YOLO (You Only Look Once), one of the most popular real-time object detection algorithms, on a state-of-the-art, resource-constrained IoT edge device, the Nvidia Jetson Orin Nano, using real-world video and image datasets including MS-COCO, ImageNet, PascalVOC, DOTA, animal videos, and car-traffic videos. Our evaluation shows that the proposed scheme improves diverse aspects of edge performance, improving runtime, memory consumption, and power usage by up to 445%, 69%, and 73%, respectively. It also achieves a 2.10× improvement over a state-of-the-art model optimization technique.
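The frame-grouping and per-core dispatch described in the abstract can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' implementation: the round-robin grouping policy, the `detect_group` stub, and the function names are assumptions (the abstract only states that frames are grouped by core count and detected in parallel); a real deployment would load a YOLO model once per worker process rather than tag frames.

```python
import os
from multiprocessing import Pool

def group_frames(frames, num_cores):
    """Distribute frames round-robin into num_cores groups.
    (Grouping policy is an assumption; the paper only says frames
    are grouped according to the device's core count.)"""
    groups = [[] for _ in range(num_cores)]
    for i, frame in enumerate(frames):
        groups[i % num_cores].append(frame)
    return groups

def detect_group(frames):
    """Stand-in for per-core object detection; a real worker would
    run a YOLO model over each frame in its group."""
    return [("detected", f) for f in frames]

def parallel_detect(frames, num_cores=None):
    """Run detection over all frames with one worker process per core."""
    num_cores = num_cores or os.cpu_count()
    groups = group_frames(frames, num_cores)
    with Pool(processes=num_cores) as pool:
        per_group = pool.map(detect_group, groups)
    # Flatten per-group results back into a single list of detections.
    return [result for group in per_group for result in group]
```

One process per core avoids Python's GIL for CPU-bound inference; on a quad-core device such as the Jetson Orin Nano this would yield four concurrent detection workers.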
Pages: 1522-1533 (12 pages)