A systematic study of the performance of machine learning models on analyzing the association between semen quality and environmental pollutants

Cited by: 0
Authors
Lu, Lu [1]
Qian, Ying [2]
Dong, Yihang [3]
Su, Han [3]
Deng, Yunxin [3]
Zeng, Qiang [4]
Li, He [2]
Affiliations
[1] Yale Univ, Dept Stat & Data Sci, New Haven, CT 06520 USA
[2] Univ Georgia, Sch Chem Mat & Biomed Engn, Athens, GA 30602 USA
[3] Brown Univ, Dept Comp Sci, Providence, RI USA
[4] Huazhong Univ Sci & Technol, Tongji Med Coll, Sch Publ Hlth, Dept Occupat & Environm Hlth, Wuhan, Hubei, Peoples R China
Keywords
machine learning; artificial intelligence; quality of semen; neural networks; phthalates; polycyclic aromatic hydrocarbons; urinary phthalate metabolites; endocrine-disrupting compounds; reproductive function; DNA damage; exposure; variability; population; biomarkers; parameters
DOI
10.3389/fphy.2023.1259273
Chinese Library Classification number
O4 [Physics];
Discipline classification code
0702;
Abstract
Human exposure to phthalates, a family of chemicals primarily used to make plastics more flexible and durable, could lead to a decline in semen quality. Extensive studies have investigated the associations between semen quality and exposure to environmental pollutants such as phthalates. However, these earlier studies rely mainly on conventional statistical methods, such as simple and efficient multivariable linear regression, which may not be effective for analyzing these complex multivariable associations. Herein, we perform a systematic study of the performance of different machine learning methods in analyzing these associations. We use data from a cohort of 1070 Chinese males from Hubei province who provided repeated urine samples for the measurement of phthalate metabolites. In addition, phthalate metabolites in semen are evaluated as a biomarker to give a more direct metric. We also incorporate patient demographics and administered medications into the analysis. Overall, six machine learning models, including linear and non-linear models, are implemented to analyze associations among thirty-one features and five metrics of semen quality. The performance of the models is evaluated by root-mean-square deviation under 10-fold cross-validation. Our investigations show that model performance varies with the semen quality metric under study. Therefore, a systematic study of the patients' data with various machine learning models is essential for improving the quantitative analysis used to discover the critical environmental pollutants that dictate semen quality. We hope this study can provide guidance on employing machine learning models in future investigations of the impact of various pollutants on semen quality.
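The evaluation protocol named in the abstract (root-mean-square deviation under 10-fold cross-validation, applied to both linear and non-linear models) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the sample size of 300 is arbitrary, only the feature count of 31 is taken from the abstract, and the two models (ordinary least squares and k-nearest-neighbour regression) are stand-ins, since the abstract does not name the six models the authors used.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square deviation between observed and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def fit_predict_linear(X_train, y_train, X_test):
    """Ordinary least squares with an intercept (illustrative linear model)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef


def fit_predict_knn(X_train, y_train, X_test, k=10):
    """k-nearest-neighbour regression (illustrative non-linear model)."""
    # Squared Euclidean distance from every test row to every training row.
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)


def cross_validated_rmse(fit_predict, X, y, n_folds=10, seed=0):
    """Mean RMSE over n_folds held-out folds of a shuffled dataset."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        y_pred = fit_predict(X[train_idx], y[train_idx], X[test_idx])
        scores.append(rmse(y[test_idx], y_pred))
    return float(np.mean(scores))


# Synthetic stand-in data (NOT the cohort data): 31 features, one outcome
# that depends linearly on the first two features plus noise.
rng = np.random.default_rng(42)
n_samples, n_features = 300, 31
X = rng.normal(size=(n_samples, n_features))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n_samples)

results = {
    "linear": cross_validated_rmse(fit_predict_linear, X, y),
    "knn": cross_validated_rmse(fit_predict_knn, X, y),
}
for name, score in results.items():
    print(f"{name}: 10-fold CV RMSE = {score:.3f}")
```

Because the synthetic outcome is linear by construction, the linear model should score better here; with a different outcome the ranking could reverse, which is the kind of metric-dependent behaviour the abstract reports.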
Pages: 12