pipeComp, a general framework for the evaluation of computational pipelines, reveals performant single cell RNA-seq preprocessing tools

Cited by: 66
Authors
Germain, Pierre-Luc [1, 2, 3]
Sonrel, Anthony [1, 2]
Robinson, Mark D. [1, 2]
Affiliations
[1] Univ Zurich, Dept Mol Life Sci, Winterthurerstr 190, CH-8057 Zurich, Switzerland
[2] SIB Swiss Inst Bioinformat, Zurich, Switzerland
[3] Swiss Fed Inst Technol, D HEST Inst Neurosci, Winterthurerstr 190, CH-8057 Zurich, Switzerland
Funding
Swiss National Science Foundation
Keywords
Single-cell RNA sequencing (scRNAseq); Pipeline; Clustering; Normalization; Filtering; Benchmark; EXPRESSION; NORMALIZATION; VARIABILITY;
DOI
10.1186/s13059-020-02136-7
Chinese Library Classification (CLC) number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
We present pipeComp, a flexible R framework for pipeline comparison handling interactions between analysis steps and relying on multi-level evaluation metrics. We apply it to the benchmark of single-cell RNA-sequencing analysis pipelines using simulated and real datasets with known cell identities, covering common methods of filtering, doublet detection, normalization, feature selection, denoising, dimensionality reduction, and clustering. pipeComp can easily integrate any other step, tool, or evaluation metric, allowing extensible benchmarks and easy applications to other fields, as we demonstrate through a study of the impact of removal of unwanted variation on differential expression analysis.
Pages: 28
Related papers
67 in total
[1]  
Albergante L., 2019, ESTIMATING EFFECTIVE, P1, DOI 10.1109/IJCNN.2019.8852450
[2]   A Random Matrix Theory Approach to Denoise Single-Cell Data [J].
Aparicio, Luis ;
Bordyuh, Mykola ;
Blumberg, Andrew J. ;
Rabadan, Raul .
PATTERNS, 2020, 1 (03)
[3]   SCnorm: robust normalization of single-cell RNA-seq data [J].
Bacher, Rhonda ;
Chu, Li-Fang ;
Leng, Ning ;
Gasch, Audrey P. ;
Thomson, James A. ;
Stewart, Ron M. ;
Newton, Michael ;
Kendziorski, Christina .
NATURE METHODS, 2017, 14 (06) :584-+
[4]   scds: computational annotation of doublets in single-cell RNA sequencing data [J].
Bais, Abha S. ;
Kostka, Dennis .
BIOINFORMATICS, 2020, 36 (04) :1150-1158
[5]  
Batson J, 2019, MOLECULAR CROSS VALI, DOI 10.1101/786269
[6]   Estimating the frequency of multiplets in single-cell RNA sequencing from cell-mixing experiments [J].
Bloom, Jesse D. .
PEERJ, 2018, 6
[7]   Analysis of Transcriptional Variability in a Large Human iPSC Library Reveals Genetic and Non-genetic Determinants of Heterogeneity [J].
Carcamo-Orive, Ivan ;
Hoffman, Gabriel E. ;
Cundiff, Paige ;
Beckmann, Noam D. ;
D'Souza, Sunita L. ;
Knowles, Joshua W. ;
Patel, Achchhe ;
Papatsenko, Dimitri ;
Abbasi, Fahim ;
Reaven, Gerald M. ;
Whalen, Sean ;
Lee, Philip ;
Shahbazi, Mohammad ;
Henrion, Marc Y. R. ;
Zhu, Kuixi ;
Wang, Sven ;
Roussos, Panos ;
Schadt, Eric E. ;
Pandey, Gaurav ;
Chang, Rui ;
Quertermous, Thomas ;
Lemischka, Ihor .
CELL STEM CELL, 2017, 20 (04) :518-+
[8]   Statistical significance of variables driving systematic variation in high-dimensional data [J].
Chung, Neo Christopher ;
Storey, John D. .
BIOINFORMATICS, 2015, 31 (04) :545-554
[9]  
Cobos FA, 2020, BIORXIV, DOI 10.1101/2020.01.10.897116
[10]   Performance Assessment and Selection of Normalization Procedures for Single-Cell RNA-Seq [J].
Cole, Michael B. ;
Risso, Davide ;
Wagner, Allon ;
DeTomaso, David ;
Ngai, John ;
Purdom, Elizabeth ;
Dudoit, Sandrine ;
Yosef, Nir .
CELL SYSTEMS, 2019, 8 (04) :315-+