SCDC: bulk gene expression deconvolution by multiple single-cell RNA sequencing references
Cited by: 141
Authors:
Dong, Meichen [1]
Thennavan, Aatish [2]
Urrutia, Eugene [1]
Li, Yun [1,3,4]
Perou, Charles M. [5,6]
Zou, Fei [1,3]
Jiang, Yuchao [1,3]
Affiliations:
[1] Univ N Carolina, Dept Biostat, Chapel Hill, NC 27599 USA
[2] Univ N Carolina, Curriculum Bioinformat & Computat Biol, Chapel Hill, NC 27599 USA
[3] Univ N Carolina, Dept Genet, Chapel Hill, NC 27599 USA
[4] Univ N Carolina, Dept Comp Sci, Chapel Hill, NC 27599 USA
[5] Univ N Carolina, Mol Oncol, Chapel Hill, NC 27599 USA
[6] Univ N Carolina, Computat Med Program, Chapel Hill, NC 27599 USA
Funding:
National Institutes of Health (USA);
Keywords:
single-cell RNA sequencing;
bulk RNA sequencing;
gene expression deconvolution;
ENSEMBLE;
batch effect;
HUMAN PANCREATIC-ISLETS;
STATISTICAL FRAMEWORK;
TECHNOLOGIES;
ARCHITECTURE;
REVEALS;
CANCER;
ATLAS;
DOI:
10.1093/bib/bbz166
Chinese Library Classification (CLC) number:
Q5 [Biochemistry];
Discipline classification codes:
071010 ;
081704 ;
Abstract:
Recent advances in single-cell RNA sequencing (scRNA-seq) enable characterization of transcriptomic profiles with single-cell resolution and circumvent averaging artifacts associated with traditional bulk RNA sequencing (RNA-seq) data. Here, we propose SCDC, a deconvolution method for bulk RNA-seq that leverages cell-type specific gene expression profiles from multiple scRNA-seq reference datasets. SCDC adopts an ENSEMBLE method to integrate deconvolution results from different scRNA-seq datasets that are produced in different laboratories and at different times, implicitly addressing the problem of batch-effect confounding. SCDC is benchmarked against existing methods using both in silico generated pseudo-bulk samples and experimentally mixed cell lines, whose known cell-type compositions serve as ground truths. We show that SCDC outperforms existing methods with improved accuracy of cell-type decomposition under both settings. To illustrate how the ENSEMBLE framework performs in complex tissues under different scenarios, we further apply our method to a human pancreatic islet dataset and a mouse mammary gland dataset. SCDC returns results that are more consistent with experimental designs and that reproduce more significant associations between cell-type proportions and measured phenotypes.
Pages: 416-427
Page count: 12