YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

Cited by: 6066
Authors
Wang, Chien-Yao [1]
Bochkovskiy, Alexey
Liao, Hong-Yuan Mark [1]
Affiliations
[1] Acad Sinica, Inst Informat Sci, Taipei, Taiwan
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023
DOI
10.1109/CVPR52729.2023.00721
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Real-time object detection is one of the most important research topics in computer vision. As new approaches regarding architecture optimization and training optimization are continually being developed, we have found two research topics that have spawned when dealing with these latest state-of-the-art methods. To address the topics, we propose a trainable bag-of-freebies oriented solution. We combine the flexible and efficient training tools with the proposed architecture and the compound scaling method. YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 120 FPS and has the highest accuracy 56.8% AP among all known real-time object detectors with 30 FPS or higher on GPU V100. Source code is released in https://github.com/WongKinYiu/yolov7.
Pages: 7464-7475
Page count: 12