A multicenter study benchmarking single-cell RNA sequencing technologies using reference samples

Cited by: 61
Authors
Chen, Wanqiu [1]
Zhao, Yongmei [2,3]
Chen, Xin [1,4]
Yang, Zhaowei [1,5]
Xu, Xiaojiang [6]
Bi, Yingtao [7]
Chen, Vicky [2,3]
Li, Jing [4,5]
Choi, Hannah [1]
Ernest, Ben [8]
Tran, Bao [3]
Mehta, Monika [3]
Kumar, Parimal [3]
Farmer, Andrew [9]
Mir, Alain [9]
Mehra, Urvashi Ann [8]
Li, Jian-Liang [6]
Moos, Malcolm, Jr. [10,11]
Xiao, Wenming [12]
Wang, Charles [1,4]
Affiliations
[1] Loma Linda Univ, Sch Med, Ctr Genom, Loma Linda, CA 92350 USA
[2] Frederick Natl Lab Canc Res, CCR SF Bioinformat Grp, Adv Biomed & Computat Sci Biomed Informat & Data, Frederick, MD USA
[3] Frederick Natl Lab Canc Res, Sequencing Facil, Frederick, MD USA
[4] Loma Linda Univ, Sch Med, Dept Basic Sci, Loma Linda, CA 92354 USA
[5] Guangzhou Med Univ, Guangzhou Inst Resp Hlth, Dept Allergy & Clin Immunol, State Key Lab Resp Dis,Affiliated Hosp 1, Guangzhou, Peoples R China
[6] NIEHS, Integrat Bioinformat Support Grp, POB 12233, Res Triangle Pk, NC 27709 USA
[7] Abbvie Cambridge Res Ctr, Cambridge, MA USA
[8] Digicon Corp, Mclean, VA USA
[9] Takara Bio USA Inc, Mountain View, CA USA
[10] US FDA, Ctr Biol Evaluat & Res, Silver Spring, MD USA
[11] US FDA, Div Cellular & Gene Therapies, Silver Spring, MD USA
[12] US FDA, Ctr Devices & Radiol Hlth, Silver Spring, MD 20993 USA
Funding
National Institutes of Health (NIH), USA;
Keywords
SEQ DATA; QUANTIFICATION; NORMALIZATION; HETEROGENEITY;
DOI
10.1038/s41587-020-00748-9
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
A comprehensive comparison of 20 single-cell RNA-seq datasets derived from two cell lines, generated on four different sequencing platforms at different centers and analyzed using six preprocessing pipelines, eight normalization methods and seven batch-correction algorithms.
Comparing diverse single-cell RNA sequencing (scRNA-seq) datasets generated by different technologies and in different laboratories remains a major challenge. Here we address the need for guidance in choosing algorithms leading to accurate biological interpretations of varied data types acquired with different platforms. Using two well-characterized cellular reference samples (breast cancer cells and B cells), captured either separately or in mixtures, we compared different scRNA-seq platforms and several preprocessing, normalization and batch-effect correction methods at multiple centers. Although preprocessing and normalization contributed to variability in gene detection and cell classification, batch-effect correction was by far the most important factor in correctly classifying the cells. Moreover, scRNA-seq dataset characteristics (for example, sample and cellular heterogeneity and platform used) were critical in determining the optimal bioinformatic method. However, reproducibility across centers and platforms was high when appropriate bioinformatic methods were applied. Our findings offer practical guidance for optimizing platform and software selection when designing an scRNA-seq study.
Pages: 1103 / +
Page count: 29