Deconvolution from bulk gene expression by leveraging sample-wise and gene-wise similarities and single-cell RNA-Seq data

Cited by: 1
Authors
Wang, Chenqi [1]
Lin, Yifan [1]
Li, Shuchao [1]
Guan, Jinting [1, 2, 3]
Affiliations
[1] Xiamen Univ, Dept Automat, Xiamen, Peoples R China
[2] Minist Educ, Key Lab Syst Control & Informat Proc, Shanghai, Peoples R China
[3] Xiamen Univ, Natl Inst Data Sci Hlth & Med, Xiamen, Peoples R China
Source
BMC GENOMICS | 2024 / Vol. 25 / Issue 01
Keywords
Deconvolution; Cell type abundance; Cell type-specific gene expression profile; Similarity matrix; Single-cell RNA-seq data; MOUSE; MAP; NORMALIZATION; HETEROGENEITY; DIVERSITY; ATLAS; STEM
DOI
10.1186/s12864-024-10728-x
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification code
071005; 0836; 090102; 100705
Abstract
Background: Bulk RNA-seq, although widely adopted, measures gene expression averaged across cells. This masks cell type heterogeneity and confounds downstream analyses. Identifying the cellular composition and the cell type-specific gene expression profiles (GEPs) therefore facilitates the study of the mechanisms underlying various biological processes. Although single-cell RNA-seq captures cell type heterogeneity in gene expression, it requires specialized and expensive resources and is currently not practical for large numbers of samples or for routine clinical settings. Computational deconvolution methods have recently been developed, but many of them estimate only cell type composition or only cell type-specific GEPs, and require the other as input. More accurate deconvolution methods that infer both cell type abundance and cell type-specific GEPs are still needed.
Results: We propose a new deconvolution algorithm, DSSC, which simultaneously infers cell type-specific gene expression and cell type proportions of heterogeneous samples by leveraging gene-gene and sample-sample similarities in bulk expression and single-cell RNA-seq data. Through comparisons with existing methods, we demonstrate that DSSC is effective in inferring both cell type proportions and cell type-specific GEPs across simulated pseudo-bulk data (including intra-dataset and inter-dataset simulations) and experimental bulk data (including mixture data and real experimental data). DSSC is robust to changes in the number of marker genes and in sample size, and it is efficient in cost and time.
Conclusions: DSSC provides a practical and promising alternative to experimental techniques for characterizing the cellular composition and the gene expression heterogeneity of heterogeneous samples.
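As a rough illustration of the problem setting described in the abstract (and not of the DSSC algorithm itself, whose details are not given in this record), the sketch below assumes the common linear mixing model, in which a bulk profile is the proportion-weighted sum of cell type-specific profiles. It builds one noiseless pseudo-bulk sample from a made-up signature matrix and recovers the proportions with multiplicative non-negative least-squares updates. The names `SIGNATURE`, `TRUE_PROPORTIONS`, `mix`, and `estimate_proportions` and all numeric values are hypothetical, and the signature is assumed to be known, which is the simpler one-sided task the abstract contrasts DSSC with.

```python
# Toy deconvolution under an assumed linear mixing model: bulk = signature @ proportions.
# This is NOT the DSSC method; it only illustrates the general problem setting.

# Hypothetical signature matrix: rows are genes, columns are cell types (all values positive).
SIGNATURE = [
    [10.0, 1.0, 1.0],
    [8.0, 2.0, 1.0],
    [1.0, 9.0, 2.0],
    [2.0, 7.0, 1.0],
    [1.0, 1.0, 12.0],
    [1.0, 2.0, 6.0],
]
# Hypothetical ground-truth cell type proportions for one sample.
TRUE_PROPORTIONS = [0.5, 0.3, 0.2]


def mix(signature, proportions):
    """Simulate a pseudo-bulk sample: each gene is a proportion-weighted sum of profiles."""
    return [sum(s * p for s, p in zip(row, proportions)) for row in signature]


def estimate_proportions(signature, bulk, n_iter=200):
    """Estimate non-negative proportions with multiplicative least-squares updates."""
    n_types = len(signature[0])
    p = [1.0 / n_types] * n_types  # start from uniform proportions
    # Precompute S^T b and S^T S, which stay fixed across iterations.
    stb = [sum(row[k] * b for row, b in zip(signature, bulk)) for k in range(n_types)]
    sts = [
        [sum(row[i] * row[j] for row in signature) for j in range(n_types)]
        for i in range(n_types)
    ]
    for _ in range(n_iter):
        denom = [sum(sts[k][j] * p[j] for j in range(n_types)) for k in range(n_types)]
        # The update p_k <- p_k * (S^T b)_k / (S^T S p)_k keeps every entry non-negative.
        p = [p[k] * stb[k] / denom[k] for k in range(n_types)]
    total = sum(p)
    return [x / total for x in p]  # rescale so the proportions sum to 1


if __name__ == "__main__":
    bulk = mix(SIGNATURE, TRUE_PROPORTIONS)
    print([round(x, 4) for x in estimate_proportions(SIGNATURE, bulk)])
```

Methods that estimate proportions and cell type-specific GEPs simultaneously, as the abstract says DSSC does, must treat both factors of this product as unknown, which is a harder problem than the one sketched here.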
Pages: 19