A cost-effective machine learning-based method for preeclampsia risk assessment and driver genes discovery

Cited by: 23
Authors
Wang, Hao [1 ,2 ]
Zhang, Zhaoyue [3 ]
Li, Haicheng [1 ,2 ]
Li, Jinzhao [1 ]
Li, Hanshuang [1 ]
Liu, Mingzhu [1 ,2 ]
Liang, Pengfei [1 ]
Xi, Qilemuge [1 ]
Xing, Yongqiang [4 ]
Yang, Lei [5 ]
Zuo, Yongchun [1 ,2 ]
Affiliations
[1] Inner Mongolia Univ, Coll Life Sci, State Key Lab Reprod Regulat & Breeding Grassland, Hohhot 010070, Peoples R China
[2] Inner Mongolia Wesure Date Technol Co Ltd, Inner Mongolia Intelligent Union Big Data Acad, Digital Coll, Hohhot 010010, Peoples R China
[3] Univ Elect Sci & Technol China, Ctr Informat Biol, Sch Life Sci & Technol, Chengdu 610054, Peoples R China
[4] Inner Mongolia Univ Sci & Technol, Sch Life Sci & Technol, Baotou 014010, Peoples R China
[5] Harbin Med Univ, Coll Bioinformat Sci & Technol, Harbin 150081, Peoples R China
Keywords
Preeclampsia risk; Machine learning; Feature selection; Marker genes; Web server; SINGLE-CELL; CANCER CLASSIFICATION; DIFFERENTIATION; EXPRESSION; IDENTIFICATION; PREDICTION;
DOI
10.1186/s13578-023-00991-y
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: The placenta, as a unique exchange organ between mother and fetus, is essential for successful human pregnancy and fetal health. Preeclampsia (PE) caused by placental dysfunction contributes to both maternal and infant morbidity and mortality. Accurate identification of PE patients plays a vital role in the formulation of treatment plans. However, traditional clinical methods for diagnosing PE have a high misdiagnosis rate.
Results: Here, we first designed a computational biology method that used single-cell transcriptome (scRNA-seq) data from healthy pregnancy (38 wk) and early-onset PE (28-32 wk) to identify pathological cell subpopulations and predict PE risk. Based on machine learning methods and feature selection techniques, we observed that the Tuning ReliefF (TURF) score combined with XGBoost (TURF_XGB) achieved optimal performance, with 92.61% accuracy and 92.46% recall for classifying nine cell subpopulations of healthy placentas. Biological landscapes of placental heterogeneity could be mapped by the 110 marker genes screened by TURF_XGB, which revealed the superiority of the TURF feature mining. Moreover, we processed the PE dataset with LASSO to obtain 497 biomarkers. Integration analysis of these two gene sets revealed that dendritic cells were closely associated with early-onset PE, and that C1QB and C1QC might drive preeclampsia by mediating inflammation. In addition, an ensemble model-based risk stratification card was developed to classify preeclampsia patients, and its area under the receiver operating characteristic curve (AUC) could reach 0.99. For broader accessibility, we designed an online web server ().
Conclusion: Single-cell transcriptome-based preeclampsia risk assessment using an ensemble machine learning framework is a valuable asset for clinical decision-making. C1QB and C1QC may be involved in the development and progression of early-onset PE by affecting the complement and coagulation cascades pathway that mediates inflammation, which has important implications for better understanding the pathogenesis of PE.
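The abstract names TURF-based feature ranking as the step that screens marker genes before classification. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes TURF works as iterative Relief-style scoring with removal of the lowest-ranked features each round, uses basic Relief (one nearest hit and one nearest miss) instead of full ReliefF, omits the XGBoost classifier, and runs on synthetic data. The names `relief_scores`, `turf_select`, and `make_toy_data`, and the parameters `n_keep` and `drop_frac`, are hypothetical.

```python
import random


def relief_scores(X, y, features):
    """Basic Relief weights (one nearest hit, one nearest miss) for the given feature indices."""
    n = len(X)
    # Per-feature range, used to normalise differences to [0, 1].
    span = {}
    for f in features:
        col = [row[f] for row in X]
        span[f] = (max(col) - min(col)) or 1.0

    def dist(a, b):
        return sum(abs(a[f] - b[f]) / span[f] for f in features)

    w = {f: 0.0 for f in features}
    for i in range(n):
        # Nearest sample of the same class (hit) and of a different class (miss).
        hit = min((j for j in range(n) if j != i and y[j] == y[i]),
                  key=lambda j: dist(X[i], X[j]))
        miss = min((j for j in range(n) if y[j] != y[i]),
                   key=lambda j: dist(X[i], X[j]))
        for f in features:
            gain = abs(X[i][f] - X[miss][f]) - abs(X[i][f] - X[hit][f])
            w[f] += gain / (span[f] * n)
    return w


def turf_select(X, y, n_keep, drop_frac=0.25):
    """Assumed TURF loop: re-score, drop the worst fraction, repeat until n_keep remain."""
    features = list(range(len(X[0])))
    while len(features) > n_keep:
        w = relief_scores(X, y, features)
        ranked = sorted(features, key=lambda f: w[f], reverse=True)
        n_drop = max(1, int(len(features) * drop_frac))
        n_drop = min(n_drop, len(features) - n_keep)
        features = ranked[:len(ranked) - n_drop]
    w = relief_scores(X, y, features)
    return sorted(features, key=lambda f: w[f], reverse=True)


def make_toy_data(n=60, n_features=12, seed=0):
    """Synthetic two-class data: features 0 and 1 carry the signal, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 5.0 * label
        row[1] -= 5.0 * label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
selected = turf_select(X, y, n_keep=3)
print("Selected feature indices, best first:", selected)
```

Re-scoring after each removal round is the point of the loop: once noisy features are dropped, the nearest-neighbour distances are computed in a cleaner feature space, which tends to stabilise the ranking of the remaining features. In a pipeline like the one described, the retained features would then be passed to a classifier.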
Pages: 12