Unsupervised removal of systematic background noise from droplet-based single-cell experiments using CellBender

Cited by: 198
Authors
Fleming, Stephen J. [1, 2]
Chaffin, Mark D. [2, 3]
Arduini, Alessandro [2, 8]
Akkad, Amer-Denis [4]
Banks, Eric [1]
Marioni, John C. [5, 6]
Philippakis, Anthony A. [1]
Ellinor, Patrick T. [2, 3, 7]
Babadi, Mehrtash [1, 2]
Affiliations
[1] Broad Inst MIT & Harvard, Data Sci Platform, Cambridge, MA 02142 USA
[2] Broad Inst MIT & Harvard, Precis Cardiol Lab PCL, Cambridge, MA 02142 USA
[3] Broad Inst MIT & Harvard, Cardiovasc Dis Initiat, Cambridge, MA USA
[4] Bayer US LLC, Precis Cardiol Lab PCL, Cambridge, MA USA
[5] Wellcome Sanger Inst, Cambridge, England
[6] European Bioinformat Inst, European Mol Biol Lab, Cambridge, England
[7] Massachusetts Gen Hosp, Cardiovasc Res Ctr, Boston, MA USA
[8] Bayer US LLC, Cambridge, MA USA
Keywords
RNA
DOI
10.1038/s41592-023-01943-7
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Using a deep generative model, CellBender models and denoises droplet-based single-cell data and improves multiple downstream analyses. Droplet-based single-cell assays, including single-cell RNA sequencing (scRNA-seq), single-nucleus RNA sequencing (snRNA-seq) and cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), generate considerable background noise counts, the hallmark of which is nonzero counts in cell-free droplets and off-target gene expression in unexpected cell types. Such systematic background noise can lead to batch effects and spurious differential gene expression results. Here we develop a deep generative model based on the phenomenology of noise generation in droplet-based assays. The proposed model accurately distinguishes cell-containing droplets from cell-free droplets, learns the background noise profile and provides noise-free quantification in an end-to-end fashion. We implement this approach in the scalable and robust open-source software package CellBender. Analysis of simulated data demonstrates that CellBender operates near the theoretically optimal denoising limit. Extensive evaluations using real datasets and experimental benchmarks highlight enhanced concordance between droplet-based single-cell data and established gene expression patterns, while the learned background noise profile provides evidence of degraded or uncaptured cell types.
Pages: 1323 / +
Page count: 39