GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild

Cited by: 1158
Authors
Huang, Lianghua [1, 2]
Zhao, Xin [1, 2]
Huang, Kaiqi [1, 2, 3, 4]
Affiliations
[1] Chinese Acad Sci, Ctr Res Intelligent Syst & Engn, Inst Automat, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
[4] CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Object tracking; Databases; Protocols; Benchmark testing; Servers; benchmark dataset; performance evaluation;
DOI
10.1109/TPAMI.2019.2957464
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We introduce here a large tracking database that offers an unprecedentedly wide coverage of common moving objects in the wild, called GOT-10k. Specifically, GOT-10k is built upon the backbone of WordNet structure [1] and it populates the majority of over 560 classes of moving objects and 87 motion patterns, magnitudes wider than the most recent similar-scale counterparts [19], [20], [23], [26]. By releasing the large high-diversity database, we aim to provide a unified training and evaluation platform for the development of class-agnostic, generic purposed short-term trackers. The features of GOT-10k and the contributions of this article are summarized in the following. (1) GOT-10k offers over 10,000 video segments with more than 1.5 million manually labeled bounding boxes, enabling unified training and stable evaluation of deep trackers. (2) GOT-10k is by far the first video trajectory dataset that uses the semantic hierarchy of WordNet to guide class population, which ensures a comprehensive and relatively unbiased coverage of diverse moving objects. (3) For the first time, GOT-10k introduces the one-shot protocol for tracker evaluation, where the training and test classes are zero-overlapped. The protocol avoids biased evaluation results towards familiar objects and it promotes generalization in tracker development. (4) GOT-10k offers additional labels such as motion classes and object visible ratios, facilitating the development of motion-aware and occlusion-aware trackers. (5) We conduct extensive tracking experiments with 39 typical tracking algorithms and their variants on GOT-10k and analyze their results in this paper. (6) Finally, we develop a comprehensive platform for the tracking community that offers full-featured evaluation toolkits, an online evaluation server, and a responsive leaderboard. The annotations of GOT-10k's test data are kept private to avoid tuning parameters on it.
Pages: 1562-1577
Number of pages: 16
Related papers
82 in total
[1]  
Bao CL, 2012, PROC CVPR IEEE, P1830, DOI 10.1109/CVPR.2012.6247881
[2]  
Bazzani L., 2011, P 28 INT C MACH LEAR, P937, DOI 10.5555/3104482.3104600
[3]   Sharing Representations for Long Tail Computer Vision Problems [J].
Bengio, Samy .
ICMI'15: PROCEEDINGS OF THE 2015 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2015, :1-1
[4]   Staple: Complementary Learners for Real-Time Tracking [J].
Bertinetto, Luca ;
Valmadre, Jack ;
Golodetz, Stuart ;
Miksik, Ondrej ;
Torr, Philip H. S. .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :1401-1409
[5]   Fully-Convolutional Siamese Networks for Object Tracking [J].
Bertinetto, Luca ;
Valmadre, Jack ;
Henriques, Joao F. ;
Vedaldi, Andrea ;
Torr, Philip H. S. .
COMPUTER VISION - ECCV 2016 WORKSHOPS, PT II, 2016, 9914 :850-865
[6]   Unveiling the Power of Deep Tracking [J].
Bhat, Goutam ;
Johnander, Joakim ;
Danelljan, Martin ;
Khan, Fahad Shahbaz ;
Felsberg, Michael .
COMPUTER VISION - ECCV 2018, PT II, 2018, 11206 :493-509
[7]  
Bolme DS, 2010, PROC CVPR IEEE, P2544, DOI 10.1109/CVPR.2010.5539960
[8]   Towards dense object tracking in a 2D honeybee hive [J].
Bozek, Katarzyna ;
Hebert, Laetitia ;
Mikheyev, Alexander S. ;
Stephens, Greg J. .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :4185-4193
[9]  
Cehovin L, 2016, IEEE WINT CONF APPL
[10]   Visual Object Tracking Performance Measures Revisited [J].
Cehovin, Luka ;
Leonardis, Ales ;
Kristan, Matej .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (03) :1261-1274