Basketball action recognition based on the combination of YOLO and a deep fuzzy LSTM network

Cited by: 24
Authors
Khobdeh, Soroush Babaee [1]
Yamaghani, Mohammad Reza [1]
Sareshkeh, Siavash Khodaparast [2]
Affiliations
[1] Islamic Azad Univ, Dept Comp Engn, Lahijan Branch, Lahijan, Iran
[2] Islamic Azad Univ, Dept Phys Educ & Sport Sci, Lahijan Branch, Lahijan, Iran
Keywords
Action recognition; Basketball; Deep learning; YOLO; LSTM; Fuzzy layer; NEURAL-NETWORKS; REPRESENTATION; HISTOGRAMS; FEATURES; FLOW
DOI
10.1007/s11227-023-05611-7
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Recognizing human actions in uncontrolled environments is important for Human-Computer Interaction (HCI), particularly in sports, where it can give athletes, coaches, and analysts insight into movement technique and help referees make well-informed decisions. Action recognition in basketball remains difficult because of cluttered backgrounds, occluded actions, and inconsistent lighting. This paper therefore proposes a method that combines YOLO with a deep fuzzy LSTM network. YOLO detects the players in the frame, and an LSTM combined with a fuzzy layer performs the final classification. The fuzzy layer is added because an LSTM alone copes poorly with uncertainty; the combination yields a more transparent, interpretable, and accurate predictive system. The proposed model was validated on the SpaceJam and Basketball-51 datasets. In the reported experiments, it outperformed all baseline models on both datasets, which supports the advantage of the combined model for basketball action recognition.
Pages: 3528-3553
Number of pages: 26