A fast, scalable and versatile tool for analysis of single-cell omics data

Times cited: 30
Authors
Zhang, Kai [1, 6]
Zemke, Nathan R. [1, 2]
Armand, Ethan J. [1, 3]
Ren, Bing [1, 2, 4, 5]
Affiliations
[1] Univ Calif San Diego, Sch Med, Dept Cellular & Mol Med, La Jolla, CA 92093 USA
[2] Univ Calif San Diego, Ctr Epigen, Sch Med, La Jolla, CA 92093 USA
[3] Univ Calif San Diego, Bioinformat & Syst Biol Program, La Jolla, CA USA
[4] Ludwig Inst Canc Res, La Jolla, CA 92093 USA
[5] Univ Calif San Diego, Inst Genom Med, La Jolla, CA 92093 USA
[6] Westlake Univ, Sch Life Sci, Westlake Lab Life Sci & Biomed, Hangzhou, Peoples R China
Funding
US National Institutes of Health (NIH);
Keywords
DIFFUSION MAPS; CHROMATIN;
DOI
10.1038/s41592-023-02139-9
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Single-cell omics technologies have revolutionized the study of gene regulation in complex tissues. A major computational challenge in analyzing these datasets is to project the large-scale, high-dimensional data into a low-dimensional space while retaining the relative relationships between cells. This low-dimensional embedding is necessary to decompose cellular heterogeneity and reconstruct cell-type-specific gene regulatory programs. Traditional dimensionality reduction techniques, however, face challenges in computational efficiency and in comprehensively addressing cellular diversity across varied molecular modalities. Here we introduce a nonlinear dimensionality reduction algorithm, embodied in the Python package SnapATAC2, which not only captures the heterogeneity of single-cell omics data more precisely but also ensures efficient runtime and memory usage, scaling linearly with the number of cells. Our algorithm demonstrates exceptional performance, scalability and versatility across diverse single-cell omics datasets, including single-cell assay for transposase-accessible chromatin using sequencing, single-cell RNA sequencing, single-cell Hi-C and single-cell multi-omics datasets, underscoring its utility in advancing single-cell analysis. SnapATAC2 uses a matrix-free spectral embedding algorithm for nonlinear dimension reduction of single-cell omics data, which shows improved performance in capturing cellular heterogeneity and improved scalability for large datasets.
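The idea of a matrix-free spectral embedding can be illustrated with a minimal sketch. It is not the SnapATAC2 implementation: the function name `matrix_free_spectral_embedding`, the choice of cosine similarity, and the use of SciPy's Lanczos solver are assumptions made for illustration only. The cell-by-cell similarity matrix is never stored; the eigensolver only needs matrix-vector products, which are computed from the cell-by-feature matrix, so each product costs time and memory linear in the number of cells.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh


def matrix_free_spectral_embedding(X, n_components=2):
    """Spectral embedding from cosine similarity without forming the n x n matrix.

    Illustrative sketch (hypothetical helper, not the SnapATAC2 API).
    X is a dense, non-negative (cells x features) array with no all-zero rows.
    """
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    if np.any(norms == 0):
        raise ValueError("every cell needs at least one non-zero feature")
    Xn = X / norms  # unit-length rows, so Xn @ Xn.T holds cosine similarities
    n = Xn.shape[0]

    # Degrees of W = Xn Xn^T - I (self-similarity removed), computed implicitly.
    degree = Xn @ (Xn.T @ np.ones(n)) - 1.0
    if np.any(degree <= 0):
        raise ValueError("every cell must be similar to at least one other cell")
    inv_sqrt_d = 1.0 / np.sqrt(degree)

    def matvec(v):
        # Computes D^{-1/2} W D^{-1/2} v using only products with Xn and Xn^T.
        u = inv_sqrt_d * np.ravel(v)
        w = Xn @ (Xn.T @ u) - u
        return inv_sqrt_d * w

    op = LinearOperator((n, n), matvec=matvec, dtype=float)
    # Lanczos iteration for the largest eigenvalues; one extra for the trivial one.
    vals, vecs = eigsh(op, k=n_components + 1, which="LA")
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    # Drop the trivial leading eigenpair (eigenvalue 1 for a connected graph).
    return vals[1:], vecs[:, 1:]
```

Calling `matrix_free_spectral_embedding(X, n_components=30)` on a cells-by-features count matrix would return the leading non-trivial eigenvalues and one row of embedding coordinates per cell. A sparse input would need `scipy.sparse` operations in place of the dense NumPy calls used here.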
Pages: 217-227
Page count: 30