Deconvolution from bulk gene expression by leveraging sample-wise and gene-wise similarities and single-cell RNA-Seq data

Cited by: 1
Authors
Wang, Chenqi [1]
Lin, Yifan [1]
Li, Shuchao [1]
Guan, Jinting [1, 2, 3]
Affiliations
[1] Xiamen Univ, Dept Automat, Xiamen, Peoples R China
[2] Minist Educ, Key Lab Syst Control & Informat Proc, Shanghai, Peoples R China
[3] Xiamen Univ, Natl Inst Data Sci Hlth & Med, Xiamen, Peoples R China
Source
BMC GENOMICS | 2024 / Vol. 25 / Issue 01
Keywords
Deconvolution; Cell type abundance; Cell type-specific gene expression profile; Similarity matrix; Single-cell RNA-seq data; MOUSE; MAP; NORMALIZATION; HETEROGENEITY; DIVERSITY; ATLAS; STEM;
DOI
10.1186/s12864-024-10728-x
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: The widely adopted bulk RNA-seq measures the gene expression average of cells, masking cell type heterogeneity, which confounds downstream analyses. Therefore, identifying the cellular composition and cell type-specific gene expression profiles (GEPs) facilitates the study of the underlying mechanisms of various biological processes. Although single-cell RNA-seq focuses on cell type heterogeneity in gene expression, it requires specialized and expensive resources and currently is not practical for a large number of samples or a routine clinical setting. Recently, computational deconvolution methodologies have been developed, while many of them only estimate cell type composition or cell type-specific GEPs by requiring the other as input. The development of more accurate deconvolution methods to infer cell type abundance and cell type-specific GEPs is still essential.
Results: We propose a new deconvolution algorithm, DSSC, which infers cell type-specific gene expression and cell type proportions of heterogeneous samples simultaneously by leveraging gene-gene and sample-sample similarities in bulk expression and single-cell RNA-seq data. Through comparisons with the other existing methods, we demonstrate that DSSC is effective in inferring both cell type proportions and cell type-specific GEPs across simulated pseudo-bulk data (including intra-dataset and inter-dataset simulations) and experimental bulk data (including mixture data and real experimental data). DSSC shows robustness to the change of marker gene number and sample size and also has cost and time efficiencies.
Conclusions: DSSC provides a practical and promising alternative to the experimental techniques to characterize cellular composition and heterogeneity in the gene expression of heterogeneous samples.
Pages: 19
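For readers unfamiliar with the problem the abstract describes, the following is a minimal sketch of the generic linear mixing model that bulk deconvolution usually starts from: bulk expression is treated as a proportion-weighted sum of cell type-specific profiles. It is not the DSSC algorithm, which estimates both quantities at once and uses similarity matrices. Here the signature matrix is assumed known, the data are simulated and noise-free, and all names (`estimate_proportions`, `signature`, `P_true`) are hypothetical.

```python
import numpy as np


def estimate_proportions(bulk, signature):
    """Estimate cell type proportions under the model bulk ~ signature @ P.

    bulk:      genes x samples matrix of bulk expression
    signature: genes x cell types matrix of cell type-specific profiles
    Returns a cell types x samples matrix whose columns sum to 1.
    """
    # Ordinary least squares for every sample at once.
    coef, *_ = np.linalg.lstsq(signature, bulk, rcond=None)
    # Proportions cannot be negative; clip, then renormalise each sample.
    coef = np.clip(coef, 0.0, None)
    return coef / coef.sum(axis=0, keepdims=True)


# Simulated toy data (assumption: no noise, known signature matrix).
rng = np.random.default_rng(0)
n_genes, n_types, n_samples = 50, 3, 5
signature = rng.gamma(shape=2.0, scale=50.0, size=(n_genes, n_types))
P_true = rng.dirichlet(np.ones(n_types), size=n_samples).T  # types x samples
bulk = signature @ P_true  # pseudo-bulk: weighted sum of cell type profiles

P_hat = estimate_proportions(bulk, signature)
```

Clipping and renormalising is a simplification chosen to keep the sketch dependency-light; a constrained solver would be the more usual choice on real, noisy data.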