scE2EGAE: enhancing single-cell RNA-Seq data analysis through an end-to-end cell-graph-learnable graph autoencoder with differentiable edge sampling

Cited by: 0
Authors
Wang, Shuo [1, 2]
Liu, Yuanning [1, 2]
Zhang, Hao [1, 2]
Liu, Zhen [3]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, 2699 Qianjin St, Changchun 130012, Jilin, Peoples R China
[2] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, 2699 Qianjin St, Changchun 130012, Jilin, Peoples R China
[3] Nagasaki Inst Appl Sci, Grad Sch Engn, 536 Aba machi, Nagasaki, Japan
Funding
National Natural Science Foundation of China
Keywords
Single-cell RNA-Seq; Bioinformatics; End-to-end; Graph neural networks; Deep learning; Autoencoder; MICROGLIA;
DOI
10.1186/s13062-025-00616-z
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Background: Single-cell RNA sequencing (scRNA-Seq) technology reveals biological processes and molecular-level genomic information at the resolution of individual cells. Numerous computational methods, including methods based on graph neural networks (GNNs), have been developed to enhance scRNA-Seq data analysis. However, existing GNN-based methods usually construct a fixed graph with the k-nearest neighbors algorithm before training, which may result in information loss.
Methods: To address this problem, we propose scE2EGAE, which learns the cell graph during training. First, the scRNA-Seq data are fed into a deep count autoencoder (DCA). Second, the hidden representations of the DCA are extracted and used to generate cell-to-cell graph edges through a straight-through estimator (STE) based on top-k sampling and Gumbel-Softmax. Finally, the generated cell-to-cell graph and the scRNA-Seq data are fed into a GNN-based downstream task. In this paper, the downstream task is a graph autoencoder that denoises scRNA-Seq data.
Results: We evaluate scE2EGAE on eight public scRNA-Seq datasets and compare it with seven existing scRNA-Seq denoising methods. The experiments cover: 1) denoising performance, measured by mean absolute error, Pearson correlation coefficient, and cosine similarity; 2) clustering performance on the denoised results, measured by adjusted Rand index, normalized mutual information, and silhouette score; and 3) cell trajectory inference performance on the denoised results, measured by the pseudo-temporal ordering score. On the scRNA-Seq denoising task, scE2EGAE outperforms most of the compared methods, indicating that it can learn cell-to-cell graphs that capture real cell-to-cell relationships.
Conclusions: We validate scE2EGAE by applying it to scRNA-Seq data denoising. The method can learn inter-cellular relationships and construct cell-to-cell graphs, thereby enhancing downstream analysis of scRNA-Seq data. The approach may inform future research on GNN-based scRNA-Seq analysis methods.
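The edge-generation step named in the Methods (top-k sampling with Gumbel-Softmax and a straight-through estimator) can be illustrated with a minimal NumPy sketch of the forward pass. This is not the authors' implementation: the function name `gumbel_topk_edges`, the use of negative squared Euclidean distance as edge logits, and the temperature `tau` are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def gumbel_topk_edges(hidden, k, tau=1.0, rng=None):
    """Sample k neighbours per cell from Gumbel-perturbed similarity logits.

    hidden : (n_cells, n_features) array, e.g. autoencoder hidden representations.
    Returns (hard, soft): a 0/1 adjacency matrix with k edges per row, and the
    row-wise softmax relaxation computed from the same perturbed logits.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = hidden.shape[0]

    # Edge logits: negative squared Euclidean distance (an assumption).
    sq = np.sum(hidden ** 2, axis=1)
    logits = -(sq[:, None] + sq[None, :] - 2.0 * hidden @ hidden.T)
    np.fill_diagonal(logits, -np.inf)  # exclude self-loops

    # Gumbel(0, 1) noise makes the top-k selection stochastic.
    u = rng.uniform(low=1e-10, high=1.0, size=(n, n))
    perturbed = (logits + (-np.log(-np.log(u)))) / tau

    # Soft relaxation: row-wise softmax over the perturbed logits.
    shifted = perturbed - perturbed.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    soft = exp / exp.sum(axis=1, keepdims=True)

    # Hard sample: the k largest perturbed logits in each row.
    topk = np.argpartition(-perturbed, k - 1, axis=1)[:, :k]
    hard = np.zeros((n, n))
    np.put_along_axis(hard, topk, 1.0, axis=1)

    # In an autodiff framework, a straight-through estimator would use
    # `hard + soft - stop_gradient(soft)`: the forward value equals `hard`,
    # while gradients flow through `soft`. NumPy has no autodiff, so both
    # parts are returned separately here.
    return hard, soft


# Usage on random stand-in data (8 cells, 5 hidden features).
rng = np.random.default_rng(0)
hidden = rng.normal(size=(8, 5))
hard, soft = gumbel_topk_edges(hidden, k=3, rng=rng)
print(hard.sum(axis=1))  # number of sampled edges per cell
```

Unlike a fixed k-nearest-neighbors graph, the sampled edges depend on the hidden representations at each training step, so the graph can change as the encoder is updated.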
Pages: 25