IBRAP: integrated benchmarking single-cell RNA-sequencing analytical pipeline

Cited by: 4
Authors
Knight, Connor H. [1]
Khan, Faraz [1]
Patel, Ankit [1]
Gill, Upkar S. [1]
Okosun, Jessica [1]
Wang, Jun [1]
Affiliations
[1] Queen Mary University of London, Barts Cancer Institute, Centre for Cancer Genomics and Computational Biology, London EC1M 6BQ, England
Keywords
single-cell RNA-seq; analytical pipeline; benchmarking; data integration; cell annotation; ATLAS
DOI
10.1093/bib/bbad061
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Single-cell ribonucleic acid (RNA)-sequencing (scRNA-seq) is a powerful tool to study cellular heterogeneity. The high dimensional data generated from this technology are complex and require specialized expertise for analysis and interpretation. The core of scRNA-seq data analysis contains several key analytical steps, which include pre-processing, quality control, normalization, dimensionality reduction, integration and clustering. Each step often has many algorithms developed with varied underlying assumptions and implications. With such a diverse choice of tools available, benchmarking analyses have compared their performances and demonstrated that tools operate differentially according to the data types and complexity. Here, we present Integrated Benchmarking scRNA-seq Analytical Pipeline (IBRAP), which contains a suite of analytical components that can be interchanged throughout the pipeline alongside multiple benchmarking metrics that enable users to compare results and determine the optimal pipeline combinations for their data. We apply IBRAP to single- and multi-sample integration analysis using primary pancreatic tissue, cancer cell line and simulated data accompanied with ground truth cell labels, demonstrating the interchangeable and benchmarking functionality of IBRAP. Our results confirm that the optimal pipelines are dependent on individual samples and studies, further supporting the rationale and necessity of our tool. We then compare reference-based cell annotation with unsupervised analysis, both included in IBRAP, and demonstrate the superiority of the reference-based method in identifying robust major and minor cell types. Thus, IBRAP presents a valuable tool to integrate multiple samples and studies to create reference maps of normal and diseased tissues, facilitating novel biological discovery using the vast volume of scRNA-seq data available.
Pages: 13
    [J]. NATURE BIOTECHNOLOGY, 2019, 37 (06) : 685 - +