scSensitiveGeneDefine: sensitive gene detection in single-cell RNA sequencing data by Shannon entropy

Cited by: 5
Authors
Chen, Zechuan [1, 2]
Yang, Zeruo [3 ]
Yuan, Xiaojun [1 ]
Zhang, Xiaoming [2 ]
Hao, Pei [2 ]
Affiliations
[1] Shanghai Univ, Coll Life Sci, Shanghai, Peoples R China
[2] Chinese Acad Sci, Inst Pasteur Shanghai, Key Lab Mol Virol & Immunol, Shanghai, Peoples R China
[3] Zhejiang YangShengTang Co Ltd, Nat Med Inst, 181 Geyazhuang, Hangzhou, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Sensitive genes; Single-cell RNA sequencing; Stochastic gene expression; Unsupervised clustering; EXPRESSION; VARIABILITY;
DOI
10.1186/s12859-021-04136-1
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: Single-cell RNA sequencing (scRNA-seq) is the most widely used technique for obtaining gene expression profiles from complex tissues. Cell subsets and developmental states are often identified via differential gene expression patterns, and most single-cell tools use highly variable genes to annotate cell subsets and states. However, we found that a group of genes that respond sensitively to environmental stimuli and have high coefficients of variation (CV) can exert an overwhelming influence on cell type annotation.
Results: We developed a method, based on the CV-rank and Shannon entropy, to identify these noise genes, which we term "sensitive genes". To validate the method, we applied it to 11 single-cell data sets from different human tissues. Most of the sensitive genes were enriched in pathways related to the cellular stress response. Furthermore, after removing the sensitive genes detected by our tool, the unsupervised clustering result was closer to the ground-truth cell labels.
Conclusions: Our study revealed the prevalence of stochastic gene expression patterns in most cell types; compared cell marker genes, housekeeping (HK) genes, and sensitive genes; demonstrated that sensitive genes have similar functions across scRNA-seq data sets; and moved unsupervised clustering results closer to the ground-truth labels. We hope our method will provide new insights into reducing noise in scRNA-seq data analysis and contribute to the development of better unsupervised clustering algorithms for scRNA-seq data.
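Illustrative sketch (not the authors' implementation): the abstract names two ingredients, a CV-rank and Shannon entropy, without giving the exact procedure. The Python below shows one plausible way the two could be combined, under assumptions that are not taken from the paper: a gene is flagged when it ranks among the highest-CV genes inside every cluster and its mean expression is spread almost evenly across clusters (high normalised entropy). The function names, the `top_fraction` and `min_entropy_ratio` thresholds, and the toy data are all hypothetical.

```python
import math
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV = population standard deviation / mean; 0.0 when the mean is 0."""
    m = mean(values)
    return pstdev(values) / m if m > 0 else 0.0


def shannon_entropy(weights):
    """Shannon entropy in bits of non-negative weights normalised to sum to 1."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


def flag_candidate_genes(expr, clusters, top_fraction=0.25, min_entropy_ratio=0.8):
    """Flag genes with a high CV rank in every cluster and high cross-cluster entropy.

    expr: dict mapping gene name -> list of expression values, one per cell.
    clusters: list of cluster labels, one per cell (same order as expr values).
    Both thresholds are assumed values chosen for illustration only.
    """
    labels = sorted(set(clusters))
    cells = {lab: [i for i, c in enumerate(clusters) if c == lab] for lab in labels}
    genes = list(expr)
    n_top = max(1, int(len(genes) * top_fraction))

    # Step 1 (CV-rank): keep genes that are in the top CV ranks of every cluster.
    candidates = set(genes)
    for lab in labels:
        cvs = {g: coefficient_of_variation([expr[g][i] for i in cells[lab]]) for g in genes}
        ranked = sorted(genes, key=lambda g: cvs[g], reverse=True)
        candidates &= set(ranked[:n_top])

    # Step 2 (entropy): keep candidates whose mean expression is spread evenly
    # over clusters, i.e. genes that do not behave like cluster markers.
    max_entropy = math.log2(len(labels)) if len(labels) > 1 else 1.0
    flagged = []
    for g in genes:
        if g not in candidates:
            continue
        cluster_means = [mean(expr[g][i] for i in cells[lab]) for lab in labels]
        if shannon_entropy(cluster_means) / max_entropy >= min_entropy_ratio:
            flagged.append(g)
    return flagged


# Toy data: 6 cells in two clusters, 4 hypothetical genes.
clusters = ["A", "A", "A", "B", "B", "B"]
expr = {
    "MARKER_A": [10, 11, 9, 0, 0, 0],     # cluster-specific, low CV within cluster A
    "HOUSEKEEPING": [5, 5, 5, 5, 5, 5],   # constant everywhere
    "STRESS": [0, 8, 1, 9, 0, 0],         # bursty in both clusters
    "STABLE": [4, 5, 6, 5, 4, 6],         # mild variation everywhere
}
flagged = flag_candidate_genes(expr, clusters)
print(flagged)
```

In this toy example only the bursty gene passes both filters; the marker-like gene fails the CV-rank step and would also score low on entropy because its expression is concentrated in one cluster.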
Pages: 13