Multimodal distribution and its impact on the accurate assessment of spermatozoa morphological data: Lessons from machine learning

Authors
Stefanovski, D. [1]
Schulze, M. [2]
Althouse, G. C. [1]
Affiliations
[1] Univ Penn, Sch Vet Med, New Bolton Ctr, Dept Clin Studies, Kennett Sq, PA USA
[2] Inst Reprod Farm Anim Schonow, Bernauer Allee 10, D-16321 Bernau, Germany
Keywords
Sperm morphology; Cytoplasmic droplet rates; Finite mixture modeling; Latent class analysis; Heterogeneity; Data simulation; MIXTURE-MODELS;
DOI
10.1016/j.anireprosci.2024.107564
Chinese Library Classification (CLC) number
S8 [Animal husbandry, veterinary medicine, hunting, sericulture, apiculture]
Discipline classification code
0905
Abstract
Objective assessment of sperm morphology is an essential component of evaluating ejaculate quality. Due to economic limitations, investigators often resort to observational studies instead of experimental ones, which provide the strongest statistical power. Observational studies yield more heterogeneous data regardless of the number of data sources (barns/farms). Using such data inevitably leads to higher variances of estimates, which reduces the statistical power of a study. In this article, we describe a statistical methodology called finite mixture modeling (FMM), which, based on the supplied data and an assumed number of sub-classes, classifies the data into two or more homogeneous distributions and determines their fractional size relative to the entire cohort. The goal is to use statistical methods that account for the heterogeneity in the variance of the sample. A figure from a previous publication was used to generate simulated data (n = 1559) on the cytoplasmic droplet rate. We found that a bimodal distribution with two latent classes best described the simulated data. Post-hoc estimation showed that about 80% of observations belonged to latent class 1 and 20% to latent class 2. The FMM methodology identified a cutoff point of 8.7%. Finally, when estimating the standard error for the total cohort, the FMM methodology yielded a 40% reduction in the standard error compared to standard methodologies. In conclusion, we show that FMM successfully accounted for the heterogeneity in the variance of the data and, as such, yielded lower estimates of the variance than standard methodologies, increasing the statistical power of the study.
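
The workflow summarized in the abstract (simulate a bimodal sample, fit a two-class mixture, read off class fractions and a cutoff) can be sketched roughly as follows. This is a minimal illustration, not the authors' software or data. It assumes Gaussian components fitted by a hand-written EM loop, and the simulation parameters (class means 5 and 14, standard deviations 1.5 and 3, an 80/20 split) are hypothetical values chosen only to produce a bimodal sample of n = 1559. The function name `fit_two_component_mixture` is likewise an assumption.

```python
import numpy as np

def fit_two_component_mixture(x, n_iter=500, tol=1e-9):
    """Fit a two-component 1-D Gaussian mixture by EM (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    # Crude initialisation: split the sample at its median.
    med = np.median(x)
    lower, upper = x[x <= med], x[x > med]
    w = np.array([0.5, 0.5])
    mu = np.array([lower.mean(), upper.mean()])
    sd = np.array([lower.std() + 1e-6, upper.std() + 1e-6])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior class probabilities for every observation.
        dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
        total = dens.sum(axis=1, keepdims=True)
        resp = dens / total
        # M-step: update class fractions, means, and standard deviations.
        nk = resp.sum(axis=0)
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        ll = np.log(total).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    order = np.argsort(mu)  # report the lower-mean class first
    return w[order], mu[order], sd[order]

# Hypothetical simulated droplet rates (%): about 80% class 1, 20% class 2.
rng = np.random.default_rng(0)
n_total = 1559
n_class1 = round(0.8 * n_total)
rates = np.concatenate([
    rng.normal(5.0, 1.5, n_class1),             # assumed class 1 parameters
    rng.normal(14.0, 3.0, n_total - n_class1),  # assumed class 2 parameters
])

weights, means, sds = fit_two_component_mixture(rates)

# Cutoff: first grid point between the class means at which the
# posterior probability of class 2 exceeds 0.5.
grid = np.linspace(means[0], means[1], 2001)
dens = weights * np.exp(-0.5 * ((grid[:, None] - means) / sds) ** 2) / (sds * np.sqrt(2 * np.pi))
post_class2 = dens[:, 1] / dens.sum(axis=1)
cutoff = grid[np.argmax(post_class2 > 0.5)]

print("class fractions:", weights)
print("class means:", means)
print("cutoff:", cutoff)
```

Because the inputs are invented, the fitted fractions and cutoff will not reproduce the 8.7% cutoff or the 40% standard error reduction reported above. The sketch only shows how such quantities can be derived from a fitted mixture.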
Pages: 6