SCANPY: large-scale single-cell gene expression data analysis

Cited by: 3647
Authors
Wolf, F. Alexander [1]
Angerer, Philipp [1]
Theis, Fabian J. [1,2]
Affiliations
[1] Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Computational Biology, Munich, Germany
[2] Technical University of Munich, Department of Mathematics, Munich, Germany
Source
Genome Biology | 2018 / Volume 19
Keywords
Single-cell transcriptomics; Machine learning; Scalability; Graph analysis; Clustering; Pseudotemporal ordering; Trajectory inference; Differential expression testing; Visualization; Bioinformatics; HETEROGENEITY; RECONSTRUCTION; VISUALIZATION;
DOI
10.1186/s13059-017-1382-0
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
SCANPY is a scalable toolkit for analyzing single-cell gene expression data. It includes methods for preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing, and simulation of gene regulatory networks. Its Python-based implementation efficiently deals with data sets of more than one million cells (https://github.com/theislab/Scanpy). Along with SCANPY, we present ANNDATA, a generic class for handling annotated data matrices (https://github.com/theislab/anndata).
Pages: 5
Related papers
52 in total
[1]  
Abadi M., 2016, TENSORFLOW LARGESCAL
[2]   viSNE enables visualization of high dimensional single-cell data and reveals phenotypic heterogeneity of leukemia [J].
Amir, El-ad David ;
Davis, Kara L. ;
Tadmor, Michelle D. ;
Simonds, Erin F. ;
Levine, Jacob H. ;
Bendall, Sean C. ;
Shenfeld, Daniel K. ;
Krishnaswamy, Smita ;
Nolan, Garry P. ;
Pe'er, Dana .
NATURE BIOTECHNOLOGY, 2013, 31 (06) :545-+
[3]  
Angerer Philipp, 2017, Current Opinion in Systems Biology, V4, P85, DOI 10.1016/j.coisb.2017.07.004
[4]   destiny: diffusion maps for large-scale single cell data in R [J].
Angerer, Philipp ;
Haghverdi, Laleh ;
Buettner, Maren ;
Theis, Fabian J. ;
Marr, Carsten ;
Buettner, Florian .
BIOINFORMATICS, 2016, 32 (08) :1241-1243
[5]  
[Anonymous], 2001, SciPy: Open source scientific tools for Python
[6]  
Bastian M., 2009, 3 INT AAAI C WEBLOGS, DOI [10.13140/2.1.1341.1520, DOI 10.1609/ICWSM.V3I1.13937]
[7]   Fast unfolding of communities in large networks [J].
Blondel, Vincent D. ;
Guillaume, Jean-Loup ;
Lambiotte, Renaud ;
Lefebvre, Etienne .
JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT, 2008,
[8]   f-scLVM: scalable and versatile factor analysis for single-cell RNA-seq [J].
Buettner, Florian ;
Pratanwanich, Naruemon ;
McCarthy, Davis J. ;
Marioni, John C. ;
Stegle, Oliver .
GENOME BIOLOGY, 2017, 18
[9]   Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells [J].
Buettner, Florian ;
Natarajan, Kedar N. ;
Casale, F. Paolo ;
Proserpio, Valentina ;
Scialdone, Antonio ;
Theis, Fabian J. ;
Teichmann, Sarah A. ;
Marioni, John C. ;
Stegle, Oliver .
NATURE BIOTECHNOLOGY, 2015, 33 (02) :155-160
[10]   Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps [J].
Coifman, RR ;
Lafon, S ;
Lee, AB ;
Maggioni, M ;
Nadler, B ;
Warner, F ;
Zucker, SW .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2005, 102 (21) :7426-7431