SCDC: bulk gene expression deconvolution by multiple single-cell RNA sequencing references

Cited by: 141
Authors
Dong, Meichen [1]
Thennavan, Aatish [2]
Urrutia, Eugene [1]
Li, Yun [1,3,4]
Perou, Charles M. [5,6]
Zou, Fei [1,3]
Jiang, Yuchao [1,3]
Affiliations
[1] Univ N Carolina, Dept Biostat, Chapel Hill, NC 27599 USA
[2] Univ N Carolina, Curriculum Bioinformat & Computat Biol, Chapel Hill, NC 27599 USA
[3] Univ N Carolina, Dept Genet, Chapel Hill, NC 27599 USA
[4] Univ N Carolina, Dept Comp Sci, Chapel Hill, NC 27599 USA
[5] Univ N Carolina, Mol Oncol, Chapel Hill, NC 27599 USA
[6] Univ N Carolina, Computat Med Program, Chapel Hill, NC 27599 USA
Funding
National Institutes of Health (USA);
Keywords
single-cell RNA sequencing; bulk RNA sequencing; gene expression deconvolution; ENSEMBLE; batch effect; HUMAN PANCREATIC-ISLETS; STATISTICAL FRAMEWORK; TECHNOLOGIES; ARCHITECTURE; REVEALS; CANCER; ATLAS;
DOI
10.1093/bib/bbz166
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Recent advances in single-cell RNA sequencing (scRNA-seq) enable characterization of transcriptomic profiles with single-cell resolution and circumvent averaging artifacts associated with traditional bulk RNA sequencing (RNA-seq) data. Here, we propose SCDC, a deconvolution method for bulk RNA-seq that leverages cell-type specific gene expression profiles from multiple scRNA-seq reference datasets. SCDC adopts an ENSEMBLE method to integrate deconvolution results from different scRNA-seq datasets that are produced in different laboratories and at different times, implicitly addressing the problem of batch-effect confounding. SCDC is benchmarked against existing methods using both in silico generated pseudo-bulk samples and experimentally mixed cell lines, whose known cell-type compositions serve as ground truths. We show that SCDC outperforms existing methods with improved accuracy of cell-type decomposition under both settings. To illustrate how the ENSEMBLE framework performs in complex tissues under different scenarios, we further apply our method to a human pancreatic islet dataset and a mouse mammary gland dataset. SCDC returns results that are more consistent with experimental designs and that reproduce more significant associations between cell-type proportions and measured phenotypes.
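The general idea in the abstract (deconvolve a bulk sample against several references, then combine the per-reference estimates) can be illustrated with a minimal sketch. It is not the SCDC implementation. The toy reference matrices `ref_a` and `ref_b`, the multiplicative-update solver, and the inverse-error weighting in `ensemble` are all assumptions chosen for illustration. The pseudo-bulk sample is built from known proportions so that the estimate can be checked against a ground truth.

```python
# Illustrative sketch only; names, data, and weighting rule are hypothetical.

def predict(basis, props):
    """Bulk expression implied by a reference basis and cell-type proportions."""
    return [sum(b * p for b, p in zip(row, props)) for row in basis]


def deconvolve(basis, bulk, n_iter=500):
    """Estimate non-negative proportions p with basis @ p close to bulk.

    basis: list of gene rows, each holding mean expression per cell type.
    Uses multiplicative updates for non-negative least squares, then
    rescales the estimate to sum to one.
    """
    n_types = len(basis[0])
    p = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        fitted = predict(basis, p)
        num = [sum(row[k] * y for row, y in zip(basis, bulk)) for k in range(n_types)]
        den = [sum(row[k] * f for row, f in zip(basis, fitted)) for k in range(n_types)]
        p = [q * n / d if d > 0 else 0.0 for q, n, d in zip(p, num, den)]
    total = sum(p)
    return [q / total for q in p]


def reconstruction_error(basis, props, bulk):
    """Sum of absolute differences between observed and predicted bulk."""
    return sum(abs(f - y) for f, y in zip(predict(basis, props), bulk))


def ensemble(estimates, errors, eps=1e-8):
    """Weight each reference by inverse reconstruction error (an assumed rule)."""
    raw = [1.0 / (e + eps) for e in errors]
    weights = [r / sum(raw) for r in raw]
    n_types = len(estimates[0])
    combined = [sum(w * est[k] for w, est in zip(weights, estimates))
                for k in range(n_types)]
    return combined, weights


# Two toy references (4 genes x 2 cell types); ref_b mimics a batch-shifted copy.
ref_a = [[10.0, 1.0], [8.0, 2.0], [1.0, 9.0], [2.0, 12.0]]
ref_b = [[12.0, 1.0], [7.0, 3.0], [2.0, 8.0], [2.0, 11.0]]

# Pseudo-bulk sample with a known composition that serves as ground truth.
true_props = [0.7, 0.3]
bulk = predict(ref_a, true_props)

est_a = deconvolve(ref_a, bulk)
est_b = deconvolve(ref_b, bulk)
errors = [reconstruction_error(ref_a, est_a, bulk),
          reconstruction_error(ref_b, est_b, bulk)]
combined, weights = ensemble([est_a, est_b], errors)
print("per-reference estimates:", est_a, est_b)
print("ensemble weights:", weights)
print("ensemble estimate:", combined)
```

In this toy setting the bulk sample is generated from `ref_a`, so that reference reconstructs it almost exactly and receives nearly all of the weight. With real data no reference fits perfectly, and the weights would spread across references.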
Pages: 416-427
Number of pages: 12