Multimodal distribution and its impact on the accurate assessment of spermatozoa morphological data: Lessons from machine learning

Authors
Stefanovski, D. [1]
Schulze, M. [2]
Althouse, G. C. [1]
Affiliations
[1] Univ Penn, Sch Vet Med, New Bolton Ctr, Dept Clin Studies, Kennett Sq, PA USA
[2] Inst Reprod Farm Anim Schonow, Bernauer Allee 10, D-16321 Bernau, Germany
Keywords
Sperm morphology; Cytoplasmic droplet rates; Finite mixture modeling; Latent class analysis; Heterogeneity; Data simulation; Mixture models
DOI
10.1016/j.anireprosci.2024.107564
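Chinese Library Classification
S8 [Animal husbandry, veterinary medicine, hunting, sericulture, apiculture]
Discipline classification code
0905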
Abstract
Objective assessment of sperm morphology is an essential component of assessing ejaculate quality. Due to economic limitations, investigators often resort to observational studies rather than experimental ones, even though experimental studies provide the strongest statistical power. Observational studies yield more heterogeneous data regardless of the number of data sources (barns/farms). Using such data inevitably leads to higher variances of estimates, which reduces the statistical power of a study. In this article, we describe a statistical methodology called finite mixture modeling (FMM), which, based on the supplied data and an assumed number of sub-classes, classifies the data into two or more homogeneous types of distributions and determines their fractional size relative to the entire cohort. The goal is to use statistical methods that account for the heterogeneity underlying the variance of the sample. A figure from a previous publication was used to generate simulated data (n=1559) on the cytoplasmic droplet rate. We identified that a bimodal distribution with two latent classes best described the simulated data. Post-hoc estimation showed that about 80% of observations belonged to latent class 1 and about 20% to latent class 2. The FMM methodology identified a cutoff point of 8.7%. Finally, when estimating the standard error for the total cohort, the FMM methodology yielded a 40% reduction in the standard error compared to standard methodologies. In conclusion, we show that FMM successfully accounted for the heterogeneity in the data and, as such, yielded lower estimates of the variance than standard methodologies, increasing the statistical power of the cohort.
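Illustrative sketch (not the authors' code): the abstract does not state the software, the component distributions, or the simulation parameters, so the example below makes its own assumptions. It assumes Gaussian components, fits a two-class mixture by expectation-maximization using only the Python standard library, and then reads off the class fractions and the cutoff at which both classes are equally likely. The simulated class means, standard deviations, and the function names fit_two_class_mixture and class_cutoff are hypothetical; only n=1559 and the roughly 80/20 split are taken from the abstract.

```python
import math
import random


def normal_pdf(x, mu, sd):
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_class_mixture(data, n_iter=200):
    """Fit a two-component Gaussian mixture by EM.

    Returns (weights, means, sds), ordered so class 1 has the lower mean.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    # Crude start: class means from the lower and upper halves of the sample.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    grand_mean = sum(xs) / n
    pooled_sd = math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / n)
    sds = [pooled_sd, pooled_sd]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each observation.
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in (0, 1)]
            total = (p[0] + p[1]) or 1e-300  # guard against underflow
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate class fractions, means and standard deviations.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = math.sqrt(max(var, 1e-6))
    order = sorted((0, 1), key=lambda k: means[k])
    return ([weights[k] for k in order],
            [means[k] for k in order],
            [sds[k] for k in order])


def class_cutoff(weights, means, sds, tol=1e-9):
    """Bisect between the class means for the equal-posterior point.

    Assumes class 1 dominates at its own mean and class 2 at its own mean.
    """
    lo, hi = means
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        p1 = weights[0] * normal_pdf(mid, means[0], sds[0])
        p2 = weights[1] * normal_pdf(mid, means[1], sds[1])
        if p2 < p1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical simulation: droplet rates (%) from two classes, n = 1559.
# The means and standard deviations are illustrative, not from the paper.
rng = random.Random(0)
sample = ([rng.gauss(4.0, 1.5) for _ in range(1247)]
          + [rng.gauss(15.0, 3.0) for _ in range(312)])

weights, means, sds = fit_two_class_mixture(sample)
cutoff = class_cutoff(weights, means, sds)
print("class fractions:", [round(w, 3) for w in weights])
print("class means:", [round(m, 2) for m in means])
print("cutoff:", round(cutoff, 2))
```

Because the simulation parameters here are invented, the printed cutoff should not be expected to match the 8.7% reported in the abstract. A real analysis would also compare models with different numbers of classes before settling on two.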
Pages: 6