Scrublet: Computational Identification of Cell Doublets in Single-Cell Transcriptomic Data

Cited by: 1239
Authors
Wolock, Samuel L. [1]
Lopez, Romain [1, 2, 3]
Klein, Allon M. [1]
Affiliations
[1] Harvard Med Sch, Dept Syst Biol, Boston, MA 02115 USA
[2] Ecole Polytech, Ctr Math Appl, F-91120 Palaiseau, France
[3] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
Keywords
SEQ;
DOI
10.1016/j.cels.2018.11.005
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Single-cell RNA-sequencing has become a widely used, powerful approach for studying cell populations. However, these methods often generate multiplet artifacts, where two or more cells receive the same barcode, resulting in a hybrid transcriptome. In most experiments, multiplets account for several percent of transcriptomes and can confound downstream data analysis. Here, we present Single-Cell Remover of Doublets (Scrublet), a framework for predicting the impact of multiplets in a given analysis and identifying problematic multiplets. Scrublet avoids the need for expert knowledge or cell clustering by simulating multiplets from the data and building a nearest neighbor classifier. To demonstrate the utility of this approach, we test Scrublet on several datasets that include independent knowledge of cell multiplets. Scrublet is freely available for download at github.com/AllonKleinLab/scrublet.
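The core idea in the abstract (simulate multiplets from the observed data, then score each cell with a nearest-neighbor classifier) can be illustrated with a minimal NumPy sketch. This is not the Scrublet implementation and does not use the Scrublet package; the function names (`simulate_doublets`, `normalize`, `doublet_scores`), the normalization, the brute-force neighbor search, and the default `k` are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_doublets(counts, n_sim, rng):
    """Build synthetic doublets by summing the counts of random cell pairs."""
    n_cells = counts.shape[0]
    pairs = rng.integers(0, n_cells, size=(n_sim, 2))
    return counts[pairs[:, 0]] + counts[pairs[:, 1]]


def normalize(counts):
    """Scale each profile to a common total, then log-transform (illustrative choice)."""
    totals = counts.sum(axis=1, keepdims=True)
    return np.log1p(1e4 * counts / np.maximum(totals, 1))


def doublet_scores(counts, n_sim=None, k=15, seed=0):
    """Score each observed cell by the fraction of simulated doublets
    among its k nearest neighbors (observed and simulated cells pooled)."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    n_obs = counts.shape[0]
    n_sim = n_obs if n_sim is None else n_sim

    sim = simulate_doublets(counts, n_sim, rng)
    data = normalize(np.vstack([counts, sim]))
    is_sim = np.concatenate([np.zeros(n_obs, dtype=bool), np.ones(n_sim, dtype=bool)])

    # Brute-force squared Euclidean distances from observed cells to all cells.
    obs = data[:n_obs]
    d2 = (obs ** 2).sum(axis=1)[:, None] - 2.0 * obs @ data.T + (data ** 2).sum(axis=1)[None, :]
    d2[np.arange(n_obs), np.arange(n_obs)] = np.inf  # a cell is not its own neighbor

    # Indices of the k closest cells for each observed cell (unordered).
    neighbors = np.argpartition(d2, k, axis=1)[:, :k]
    return is_sim[neighbors].mean(axis=1)
```

A high score means a cell sits in a neighborhood dominated by simulated doublets. In this simplified form, doublets formed from two cells of the same type are hard to separate from singlets, since their simulated counterparts resemble ordinary cells of that type.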
Pages: 281-+
Page count: 20
Related papers
30 items in total
[1]   A Multiplexed Single-Cell CRISPR Screening Platform Enables Systematic Dissection of the Unfolded Protein Response [J].
Adamson, Britt ;
Norman, Thomas M. ;
Jost, Marco ;
Cho, Min Y. ;
Nunez, James K. ;
Chen, Yuwen ;
Villalta, Jacqueline E. ;
Gilbert, Luke A. ;
Horlbeck, Max A. ;
Hein, Marco Y. ;
Pak, Ryan A. ;
Gray, Andrew N. ;
Gross, Carol A. ;
Dixit, Atray ;
Parnas, Oren ;
Regev, Aviv ;
Weissman, Jonathan S. .
CELL, 2016, 167 (07) :1867-+
[2]   CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING [J].
BENJAMINI, Y ;
HOCHBERG, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[3]  
Bernhardsson E., 2013, ANNOY: Approximate nearest neighbors in C++/Python optimized for memory usage and loading/saving to disk
[4]   Fast unfolding of communities in large networks [J].
Blondel, Vincent D. ;
Guillaume, Jean-Loup ;
Lambiotte, Renaud ;
Lefebvre, Etienne .
JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT, 2008,
[5]  
Datlinger P, 2017, NAT METHODS, V14, P297, DOI [10.1038/NMETH.4177, 10.1038/nmeth.4177]
[6]  
Elias JE, 2010, METHODS MOL BIOL, V604, P55, DOI 10.1007/978-1-60761-444-9_5
[7]  
Gehring J., 2018, BIORXIV
[8]   Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput [J].
Gierahn, Todd M. ;
Wadsworth, Marc H., II ;
Hughes, Travis K. ;
Bryson, Bryan D. ;
Butler, Andrew ;
Satija, Rahul ;
Fortune, Sarah ;
Love, J. Christopher ;
Shalek, Alex K. .
NATURE METHODS, 2017, 14 (04) :395-+
[9]   Using single-cell genomics to understand developmental processes and cell fate decisions [J].
Griffiths, Jonathan A. ;
Scialdone, Antonio ;
Marioni, John C. .
MOLECULAR SYSTEMS BIOLOGY, 2018, 14 (04)
[10]   De Novo Prediction of Stem Cell Identity using Single-Cell Transcriptome Data [J].
Grun, Dominic ;
Muraro, Mauro J. ;
Boisset, Jean-Charles ;
Wiebrands, Kay ;
Lyubimova, Anna ;
Dharmadhikari, Gitanjali ;
van den Born, Maaike ;
van Es, Johan ;
Jansen, Erik ;
Clevers, Hans ;
de Koning, Eelco J. P. ;
van Oudenaarden, Alexander .
CELL STEM CELL, 2016, 19 (02) :266-277