A cost-effective machine learning-based method for preeclampsia risk assessment and driver genes discovery

Cited by: 23
Authors
Wang, Hao [1, 2]
Zhang, Zhaoyue [3]
Li, Haicheng [1, 2]
Li, Jinzhao [1]
Li, Hanshuang [1]
Liu, Mingzhu [1, 2]
Liang, Pengfei [1]
Xi, Qilemuge [1]
Xing, Yongqiang [4]
Yang, Lei [5]
Zuo, Yongchun [1, 2]
Affiliations
[1] Inner Mongolia Univ, Coll Life Sci, State Key Lab Reprod Regulat & Breeding Grassland, Hohhot 010070, Peoples R China
[2] Inner Mongolia Wesure Date Technol Co Ltd, Inner Mongolia Intelligent Union Big Data Acad, Digital Coll, Hohhot 010010, Peoples R China
[3] Univ Elect Sci & Technol China, Ctr Informat Biol, Sch Life Sci & Technol, Chengdu 610054, Peoples R China
[4] Inner Mongolia Univ Sci & Technol, Sch Life Sci & Technol, Baotou 014010, Peoples R China
[5] Harbin Med Univ, Coll Bioinformat Sci & Technol, Harbin 150081, Peoples R China
Keywords
Preeclampsia risk; Machine learning; Feature selection; Marker genes; Web server; SINGLE-CELL; CANCER CLASSIFICATION; DIFFERENTIATION; EXPRESSION; IDENTIFICATION; PREDICTION
DOI
10.1186/s13578-023-00991-y
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Background: The placenta, as a unique exchange organ between mother and fetus, is essential for successful human pregnancy and fetal health. Preeclampsia (PE) caused by placental dysfunction contributes to both maternal and infant morbidity and mortality. Accurate identification of PE patients plays a vital role in the formulation of treatment plans. However, traditional clinical methods for diagnosing PE have a high misdiagnosis rate.

Results: Here, we first designed a computational biology method that used single-cell transcriptome (scRNA-seq) data from healthy pregnancy (38 wk) and early-onset PE (28-32 wk) to identify pathological cell subpopulations and predict PE risk. Comparing machine learning methods and feature selection techniques, we observed that the Tuning ReliefF (TURF) score combined with XGBoost (TURF_XGB) achieved optimal performance, with 92.61% accuracy and 92.46% recall for classifying nine cell subpopulations of healthy placentas. The biological landscape of placental heterogeneity could be mapped by the 110 marker genes screened by TURF_XGB, which demonstrated the superiority of TURF feature mining. Moreover, we processed the PE dataset with LASSO to obtain 497 biomarkers. Integration analysis of these two gene sets revealed that dendritic cells were closely associated with early-onset PE, and that C1QB and C1QC might drive preeclampsia by mediating inflammation. In addition, an ensemble model-based risk stratification card was developed to classify preeclampsia patients, and its area under the receiver operating characteristic curve (AUC) reached 0.99. For broader accessibility, we designed an online web server.

Conclusion: Single-cell transcriptome-based preeclampsia risk assessment using an ensemble machine learning framework is a valuable asset for clinical decision-making. C1QB and C1QC may be involved in the development and progression of early-onset PE by affecting the complement and coagulation cascades pathway that mediates inflammation, which has important implications for better understanding the pathogenesis of PE.
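
The abstract names two ideas that are easy to illustrate in isolation: Relief-style feature scoring and the AUC metric. The sketch below is a minimal illustration only, not the authors' TURF_XGB pipeline. It implements the basic two-class Relief weighting (without the iterative feature removal that TuRF adds, and without XGBoost) and a rank-based AUC. The function names `relief_weights` and `roc_auc` and the toy data are hypothetical, chosen for illustration.

```python
from typing import List, Sequence


def relief_weights(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Basic Relief feature weights for a two-class dataset.

    Assumption: this is plain Relief, not the TuRF variant named in the abstract.
    Every sample is used once, so the result is deterministic.
    """
    n, d = len(X), len(X[0])
    # Per-feature value ranges, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        hit = min(hits, key=lambda k: dist(X[i], X[k]))     # nearest same-class sample
        miss = min(misses, key=lambda k: dist(X[i], X[k]))  # nearest other-class sample
        for j in range(d):
            # Reward features that differ across classes, penalize those
            # that differ within a class.
            w[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return w


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented): feature 0 separates the classes, feature 1 does not.
X = [[0.10, 0.9], [0.20, 0.1], [0.15, 0.5],
     [0.90, 0.8], [0.80, 0.2], [0.85, 0.5]]
y = [0, 0, 0, 1, 1, 1]

weights = relief_weights(X, y)
auc = roc_auc([row[0] for row in X], y)
```

On data like this, the informative feature receives the larger weight, which is the property that lets a Relief-type score rank genes before a classifier is trained on the top-ranked subset.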
Pages: 12