TTNet: Real-time temporal and spatial video analysis of table tennis

Cited by: 49
Authors
Voeikov, Roman [1]
Falaleev, Nikolay [1]
Baikulov, Ruslan [1]
Affiliations
[1] OSAI, Moscow, Russia
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020) | 2020
DOI
10.1109/CVPRW50498.2020.00450
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present TTNet, a neural network aimed at real-time processing of high-resolution table tennis videos that provides both temporal (event spotting) and spatial (ball detection and semantic segmentation) data. This approach supplies the core information an auto-referee system needs to reason about score updates. We also publish OpenTTGames, a multi-task dataset of table tennis game videos recorded at 120 fps and labeled with events, semantic segmentation masks, and ball coordinates, for evaluating multi-task approaches, primarily oriented toward spotting quick events and tracking small objects. On the test part of the presented dataset, TTNet demonstrated 97.0% accuracy in game event spotting, along with a 2-pixel RMSE in ball detection with 97.5% accuracy. The proposed network allows the processing of down-scaled full HD videos with an inference time below 6 ms per input tensor on a machine with a single consumer-grade GPU. Thus, we contribute to the development of real-time multi-task deep learning applications and present an approach that is potentially capable of substituting for manual data collection by sports scouts, supporting referees' decision-making, and gathering extra information about the game process.
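The RMSE reported for ball detection is a root-mean-square error over ball positions. The abstract does not give the exact definition, so the following is only a minimal sketch, assuming the per-frame error is the Euclidean distance in pixels between the predicted and the labeled ball center. The function name `ball_rmse` and the sample coordinates are hypothetical and are not taken from the paper or the dataset.

```python
import math


def ball_rmse(predicted, labeled):
    """Root-mean-square Euclidean error, in pixels, between two
    equal-length sequences of (x, y) ball-center coordinates.

    Assumption: this is one plausible reading of "RMSE in ball
    detection"; the abstract does not specify the exact formula.
    """
    if len(predicted) != len(labeled) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [
        (px - lx) ** 2 + (py - ly) ** 2
        for (px, py), (lx, ly) in zip(predicted, labeled)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical coordinates for three frames (illustrative only).
predicted = [(100.0, 50.0), (103.0, 54.0), (110.0, 60.0)]
labeled = [(100.0, 50.0), (100.0, 50.0), (110.0, 62.0)]
print(f"RMSE: {ball_rmse(predicted, labeled):.3f} px")
```

Under this reading, a few large misses dominate the score because errors are squared before averaging, which is why an RMSE figure is usually read alongside an accuracy figure.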
Pages: 3857-3865
Page count: 9