IBRAP: integrated benchmarking single-cell RNA-sequencing analytical pipeline

Cited by: 4
Authors
Knight, Connor H. [1]
Khan, Faraz [1]
Patel, Ankit [1]
Gill, Upkar S. [1]
Okosun, Jessica [1]
Wang, Jun [1]
Affiliations
[1] Queen Mary University of London, Barts Cancer Institute, Centre for Cancer Genomics and Computational Biology, London EC1M 6BQ, England
Keywords
single-cell RNA-seq; analytical pipeline; benchmarking; data integration; cell annotation; ATLAS
DOI
10.1093/bib/bbad061
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Single-cell ribonucleic acid (RNA)-sequencing (scRNA-seq) is a powerful tool to study cellular heterogeneity. The high-dimensional data generated by this technology are complex and require specialized expertise for analysis and interpretation. The core of scRNA-seq data analysis comprises several key analytical steps: pre-processing, quality control, normalization, dimensionality reduction, integration and clustering. For each step, many algorithms have been developed with varied underlying assumptions and implications. With such a diverse choice of tools available, benchmarking analyses have compared their performance and demonstrated that tools perform differently depending on data type and complexity. Here, we present the Integrated Benchmarking scRNA-seq Analytical Pipeline (IBRAP), which contains a suite of analytical components that can be interchanged throughout the pipeline, alongside multiple benchmarking metrics that enable users to compare results and determine the optimal pipeline combinations for their data. We apply IBRAP to single- and multi-sample integration analysis using primary pancreatic tissue, cancer cell line and simulated data accompanied by ground truth cell labels, demonstrating the interchangeable and benchmarking functionality of IBRAP. Our results confirm that the optimal pipelines depend on individual samples and studies, further supporting the rationale for and necessity of our tool. We then compare reference-based cell annotation with unsupervised analysis, both included in IBRAP, and demonstrate the superiority of the reference-based method in identifying robust major and minor cell types. Thus, IBRAP is a valuable tool for integrating multiple samples and studies to create reference maps of normal and diseased tissues, facilitating novel biological discovery using the vast volume of scRNA-seq data available.
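The idea of interchanging pipeline components and scoring each combination against ground truth cell labels can be illustrated with a minimal sketch. It does not use IBRAP's own interface: the toy counts, the component functions (`library_size_log`, `gene_difference`, `largest_gap_split` and the others) and the choice of the adjusted Rand index as the single metric are all assumptions made for illustration only.

```python
from itertools import product
from math import comb, log1p
from statistics import median

# Toy data (invented for illustration): two genes, eight cells, two cell types.
# Sequencing depth varies widely between cells, which confounds raw counts.
CELLS = [(90, 10), (9, 1), (450, 50), (18, 2),   # type "A": gene 0 dominant
         (10, 90), (1, 9), (50, 450), (2, 18)]   # type "B": gene 1 dominant
TRUTH = ["A"] * 4 + ["B"] * 4


def adjusted_rand_index(truth, pred):
    """Adjusted Rand index between two labelings (1.0 = identical partitions)."""
    table, rows, cols = {}, {}, {}
    for t, p in zip(truth, pred):
        table[(t, p)] = table.get((t, p), 0) + 1
        rows[t] = rows.get(t, 0) + 1
        cols[p] = cols.get(p, 0) + 1
    sum_cells = sum(comb(c, 2) for c in table.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(len(truth), 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster on both sides
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Step 1: normalization (hypothetical interchangeable components).
def no_normalization(cell):
    return [float(c) for c in cell]

def library_size_log(cell):
    total = sum(cell)
    return [log1p(c / total * 1e4) for c in cell]

# Step 2: reduction of each cell to a single score.
def first_gene(vec):
    return vec[0]

def gene_difference(vec):
    return vec[0] - vec[1]

# Step 3: clustering of the one-dimensional scores into two groups.
def median_split(scores):
    cut = median(scores)
    return [int(s > cut) for s in scores]

def largest_gap_split(scores):
    ordered = sorted(set(scores))
    if len(ordered) < 2:
        return [0] * len(scores)
    # Cut at the midpoint of the widest gap (first one wins on ties).
    i = max(range(len(ordered) - 1), key=lambda k: ordered[k + 1] - ordered[k])
    cut = (ordered[i] + ordered[i + 1]) / 2
    return [int(s > cut) for s in scores]


NORMALIZERS = {"none": no_normalization, "library_size": library_size_log}
REDUCERS = {"gene0": first_gene, "difference": gene_difference}
CLUSTERERS = {"median_split": median_split, "largest_gap": largest_gap_split}


def benchmark(cells, truth):
    """Run every component combination and score it against the true labels."""
    scores = {}
    for n, r, c in product(NORMALIZERS, REDUCERS, CLUSTERERS):
        embedded = [REDUCERS[r](NORMALIZERS[n](cell)) for cell in cells]
        scores[(n, r, c)] = adjusted_rand_index(truth, CLUSTERERS[c](embedded))
    return scores


results = benchmark(CELLS, TRUTH)
for combo, ari in sorted(results.items(), key=lambda kv: -kv[1]):
    print(" + ".join(combo), f"ARI={ari:.3f}")
```

On this toy input, several combinations recover the true labels while others do not, and which ones succeed depends on how sequencing depth interacts with each component. This mirrors, in miniature, the abstract's point that the best-performing pipeline is data dependent.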
Pages: 13