KRATOS: Context-Aware Cell Type Classification and Interpretation Using Joint Dimensionality Reduction and Clustering

Cited by: 0
Authors
Zhou, Zihan [1]
Du, Zijia [2]
Chaterji, Somali [1]
Affiliations
[1] Purdue Univ, W Lafayette, IN 47907 USA
[2] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
Source
PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022 | 2022
Keywords
Single-cell RNA analysis; DNN; machine learning explanation; classification; perturbation methods; clustering; DIFFERENTIAL EXPRESSION
DOI
10.1145/3534678.3539455
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
A common workflow for single-cell RNA-sequencing (sc-RNA-seq) data analysis is to orchestrate a three-step pipeline. First, conduct a dimension reduction of the input cell profile matrix; second, cluster the cells in the latent space; and third, extract the "gene panels" that distinguish a certain cluster from others. This workflow has the primary drawback that the three steps are performed independently, neglecting the dependencies among the steps and among the marker genes or gene panels. In our system, Kratos, we alter the three-step workflow to a two-step one, where we jointly optimize the first two steps and add the third (interpretability) step to form an integrated sc-RNA-seq analysis pipeline. We show that the more compact workflow of Kratos extracts marker genes that can better discriminate the target cluster, distilling underlying mechanisms guiding cluster membership. In doing so, Kratos is significantly better than the two SOTA baselines we compare against, specifically 5.62% superior to Global Counterfactual Explanation (GCE) [ICML-20], and 3.31% better than Adversarial Clustering Explanation (ACE) [ICML-21], measured by the AUROC of a kernel-SVM classifier. We open-source our code and datasets here: https://github.com/icanforce/single-cell-genomics-kratos.
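The conventional three-step workflow that the abstract contrasts against can be sketched as follows. This is a minimal illustration only, not the Kratos method and not the authors' code: PCA, k-means, and a standardized mean-difference gene ranking are stand-ins chosen here for the three independent steps, the function name `three_step_baseline` and all of its parameters are hypothetical, and the data is synthetic. The final lines mirror the abstract's evaluation idea by scoring the extracted gene panel with the AUROC of an RBF-kernel SVM.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def three_step_baseline(X, n_components=10, n_clusters=3, target=0,
                        panel_size=5, seed=0):
    """Hypothetical helper: run the three steps independently on a
    cells-by-genes matrix X and score the resulting gene panel."""
    # Step 1: dimension reduction of the cell profile matrix (PCA is an assumption).
    Z = PCA(n_components=n_components, random_state=seed).fit_transform(X)

    # Step 2: cluster the cells in the latent space (k-means is an assumption).
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(Z)

    # Step 3: rank genes by how strongly their mean differs between the
    # target cluster and all other cells, and keep the top-ranked genes.
    in_target = labels == target
    diff = X[in_target].mean(axis=0) - X[~in_target].mean(axis=0)
    score = np.abs(diff) / (X.std(axis=0) + 1e-8)
    panel = np.argsort(score)[::-1][:panel_size]

    # Evaluation: cross-validated AUROC of an RBF-kernel SVM that separates
    # the target cluster from the rest using only the panel genes.
    auroc = cross_val_score(SVC(kernel="rbf"), X[:, panel],
                            in_target.astype(int), cv=3,
                            scoring="roc_auc").mean()
    return labels, panel, auroc


# Synthetic data: 3 groups of 60 cells, 50 genes, 5 elevated genes per group.
rng = np.random.default_rng(0)
X = rng.normal(size=(180, 50))
for g in range(3):
    X[g * 60:(g + 1) * 60, g * 5:(g + 1) * 5] += 5.0

labels, panel, auroc = three_step_baseline(X)
print("panel genes:", sorted(panel.tolist()), "AUROC:", round(float(auroc), 3))
```

Because each step here is fitted without knowledge of the others, the gene ranking in step 3 can only explain whatever clusters steps 1 and 2 happened to produce, which is the independence drawback the abstract describes.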
Pages: 2616-2625
Page count: 10