Understanding gender differences in professional European football through machine learning interpretability and match actions data

Cited by: 0
Authors
Marc Garnica-Caparrós
Daniel Memmert
Affiliation
[1] German Sport University Cologne, Institute of Training and Computer Science in Sport
Source
Scientific Reports, vol. 11
Abstract
After the great success of the Women's World Cup in 2019, several platforms have started identifying the reasons for gender inequality in European football. Even though these inequalities emerge from a variety of key aspects of the modern sport, we focused on the game itself and evaluated the main differential features of European male and female football players in match actions data, under the assumption that significant differences and established patterns between genders would be found. A methodology for unbiased feature extraction and objective analysis is presented, based on data integration and machine learning explainability algorithms. Female ($n_0 = 1511$) and male ($n_1 = 2703$) data points were collected from event data and categorized by game period and player position. We set up a supervised classification pipeline to predict the gender of each player from their actions in the game. The comparison methodology did not include any qualitative enrichment or subjective analysis, to prevent biased data enhancement or gender-related processing. The pipeline included three representative binary classification models: a logic-based Decision Tree, a probabilistic Logistic Regression, and a multilayer perceptron Neural Network. Each model tried to draw out the differences between male and female data points, and we interpreted the results using machine learning explainability methods to understand the underlying mechanics of the implemented models.
The study was able to determine pivotal factors that differentiate performance by gender and to reveal gender-specific patterns involving more than one indicator. Data enhancement and critical variables analysis are essential next steps to support this framework, which can serve as a baseline for further studies and training developments.
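The abstract's general idea (train a binary classifier on per-player features, then ask which features the model relies on) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the two feature names are hypothetical, only a logistic regression is shown (one of the three model families the abstract names), and permutation importance stands in for the explainability methods, which the abstract does not specify.

```python
import math
import random

FEATURE_NAMES = ["informative_feature", "noise_feature"]  # hypothetical names


def make_synthetic_players(n_per_class, rng):
    """Synthetic stand-in data: only the first feature depends on the label."""
    rows, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            informative = rng.gauss(1.0 if label else -1.0, 1.0)
            noise = rng.gauss(0.0, 1.0)
            rows.append([informative, noise])
            labels.append(label)
    return rows, labels


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_regression(rows, labels, epochs=200, lr=0.1):
    """Full-batch gradient descent on the mean log-loss."""
    weights, bias, n = [0.0] * len(rows[0]), 0.0, len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def accuracy(model, rows, labels):
    """Fraction of rows whose predicted class matches the label."""
    weights, bias = model
    hits = 0
    for x, y in zip(rows, labels):
        predicted = 1 if sum(w * v for w, v in zip(weights, x)) + bias >= 0 else 0
        hits += predicted == y
    return hits / len(rows)


def permutation_importance(model, rows, labels, rng, repeats=5):
    """Mean accuracy drop when one feature column is shuffled across rows."""
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(repeats):
            column = [x[j] for x in rows]
            rng.shuffle(column)
            shuffled = [x[:j] + [c] + x[j + 1:] for x, c in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / repeats)
    return importances


rng = random.Random(0)
train_rows, train_labels = make_synthetic_players(200, rng)
test_rows, test_labels = make_synthetic_players(100, rng)

model = fit_logistic_regression(train_rows, train_labels)
base_acc = accuracy(model, test_rows, test_labels)
importances = permutation_importance(model, test_rows, test_labels, rng)

print(f"held-out accuracy: {base_acc:.3f}")
for name, value in zip(FEATURE_NAMES, importances):
    print(f"{name}: mean accuracy drop {value:.3f}")
```

Permutation importance is model-agnostic, so the same `permutation_importance` function could in principle be applied to a decision tree or a neural network exposing the same prediction interface; whether the paper used this or another explainability technique is not stated in the abstract.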