GSEApy: a comprehensive package for performing gene set enrichment analysis in Python

Cited by: 348
Authors
Fang, Zhuoqing [1 ]
Liu, Xinyuan [2 ]
Peltz, Gary [1 ]
Affiliations
[1] Stanford Univ, Dept Anesthesia Pain & Perioperat Med, Sch Med, Stanford, CA 94305 USA
[2] Stanford Univ, Sch Med, Dept Otolaryngol Head & Neck Surg, Stanford, CA 94305 USA
Keywords
CELL; SIGNATURE; CANCER;
DOI
10.1093/bioinformatics/btac757
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Gene set enrichment analysis (GSEA) is a commonly used algorithm for characterizing gene expression changes. However, the currently available tools used to perform GSEA have a limited ability to analyze large datasets, which is particularly problematic for the analysis of single-cell data. To overcome this limitation, we developed a GSEA package in Python (GSEApy), which could efficiently analyze large single-cell datasets.
Results: We present a package (GSEApy) that performs GSEA in either the command line or Python environment. GSEApy uses a Rust implementation to enable it to calculate the same enrichment statistic as GSEA for a collection of pathways. The Rust implementation of GSEApy is 3-fold faster than the NumPy version of GSEApy (v0.10.8) and uses >4-fold less memory. GSEApy also provides an interface between Python and Enrichr web services, as well as for BioMart. The Enrichr application programming interface enables GSEApy to perform over-representation analysis for an input gene list. Furthermore, GSEApy consists of several tools, each designed to facilitate a particular type of enrichment analysis.
Availability and implementation: The new GSEApy with Rust extension is deposited in PyPI: https://pypi.org/project/gseapy/. The GSEApy source code is freely available at https://github.com/zqfang/GSEApy. Also, the documentation website is available at https://gseapy.rtfd.io/.
Contact: gpeltz@stanford.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
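As a rough illustration of the enrichment statistic mentioned in the abstract, the sketch below computes a GSEA-style running-sum enrichment score in plain Python. It is not GSEApy's Rust or NumPy code. The function name `enrichment_score`, its parameters, and the toy gene names are hypothetical, and the sketch assumes the input genes are already ranked from highest to lowest score.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Illustrative GSEA-style enrichment score (hypothetical helper).

    ranked_genes: gene names, assumed already sorted by descending score.
    scores: ranking metric for each gene, in the same order.
    gene_set: collection of gene names forming the pathway.
    weight: exponent applied to |score| for genes in the set.
    """
    members = set(gene_set)
    in_set = [gene in members for gene in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    # Genes in the set step the running sum up in proportion to |score|**weight.
    hit_weights = [abs(s) ** weight if hit else 0.0
                   for s, hit in zip(scores, in_set)]
    norm = sum(hit_weights)
    if norm == 0.0:
        raise ValueError("all gene-set members have a zero score")

    running = 0.0
    best = 0.0
    for w, hit in zip(hit_weights, in_set):
        # Genes outside the set step the running sum down by a constant.
        running += w / norm if hit else -1.0 / n_miss
        # The score is the largest deviation of the running sum from zero.
        if abs(running) > abs(best):
            best = running
    return best


# Toy example: the two set members sit at the top of the ranking.
genes = ["A", "B", "C", "D", "E", "F"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
print(enrichment_score(genes, metric, {"A", "B"}))
```

A set concentrated at the top of the ranking gives a score near +1, and one concentrated at the bottom gives a score near -1. Assessing significance (for example by permutation) is outside this sketch.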
Pages: 3