DeepTag: A General Framework for Fiducial Marker Design and Detection

Cited by: 12
Authors
Zhang, Zhuming [1]
Hu, Yongtao [1]
Yu, Guoxing [1]
Dai, Jingwen [1]
Affiliations
[1] Guangdong Virtual Real Technol Co Ltd, X Lab, Shenzhen 518000, Guangdong, Peoples R China
Keywords
Symbols; Image coding; Convolutional neural networks; Training data; Robustness; Pose estimation; Detection algorithms; Fiducial marker; deep learning; object detection; marker design; monocular pose estimation;
DOI
10.1109/TPAMI.2022.3174603
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A fiducial marker system usually consists of markers, a detection algorithm, and a coding system. The appearance of markers and the detection robustness are generally limited by the existing detection algorithms, which are hand-crafted with traditional low-level image processing techniques. Furthermore, a sophisticated coding system is required to overcome the shortcomings of both markers and detection algorithms. To improve flexibility and robustness across applications, we propose a general deep-learning-based framework, DeepTag, for fiducial marker design and detection. DeepTag not only supports detection of a wide variety of existing marker families, but also makes it possible to design new marker families with customized local patterns. Moreover, we propose an effective procedure to synthesize training data on the fly without manual annotations. Thus, DeepTag can easily adapt to existing and newly designed marker families. To validate DeepTag and existing methods, besides using existing datasets, we further collect a new large and challenging dataset in which markers are placed at different viewing distances and angles. Experiments show that DeepTag supports different marker families well and greatly outperforms the existing methods in terms of both detection robustness and pose accuracy. Both code and dataset are available at https://herohuyongtao.github.io/research/publications/deep-tag/.
Pages: 2931-2944
Number of pages: 14