Realistic Cell Type Annotation and Discovery for Single-cell RNA-seq Data

Cited by: 0
Authors
Zhai, Yuyao [1]
Chen, Liang [4]
Deng, Minghua [1, 2, 3]
Affiliations
[1] Peking Univ, Sch Math Sci, Beijing, Peoples R China
[2] Peking Univ, Ctr Stat Sci, Beijing, Peoples R China
[3] Peking Univ, Ctr Quantitat Biol, Beijing, Peoples R China
[4] Huawei Technol Co Ltd, Shenzhen, Peoples R China
Source
PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023 | 2023
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The rapid development of single-cell RNA sequencing (scRNA-seq) technologies allows us to explore tissue heterogeneity at the cellular level. Cell annotation plays an essential role in the downstream analysis of scRNA-seq data. Existing methods usually classify the novel cells in target data as a single "unassigned" group and rarely discover the fine-grained cell type structure among them. Moreover, these methods carry risks, such as susceptibility to batch effects between reference and target data, which further compromise the inherent discriminability of the target data. Considering these limitations, we propose a new and practical task called realistic cell type annotation and discovery for scRNA-seq data. In this task, cells from seen cell types are given class labels, while cells from novel cell types are given cluster labels. To tackle this problem, we propose an end-to-end algorithm called scPOT from the perspective of optimal transport (OT). Specifically, we first design an OT-based prototypical representation learning paradigm that encourages both global discrimination of clusters and local consistency of cells, in order to uncover the intrinsic structure of the target data. We then propose an unbalanced OT-based partial alignment strategy with statistical filling to detect the cells from seen cell types across reference and target data. Notably, scPOT also introduces an easy yet effective solution for automatically estimating the total number of cell types in the target data. Extensive results on our carefully designed evaluation benchmarks demonstrate the superiority of scPOT over various state-of-the-art clustering and annotation methods.
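The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind OT-based assignment of cells to prototypes, not scPOT itself. It uses standard entropic (Sinkhorn) optimal transport with uniform marginals, which is an assumption made here for illustration; the function name `sinkhorn_plan`, the `epsilon` and `n_iters` parameters, and the synthetic similarity scores are all hypothetical. The unbalanced, partial-alignment variant mentioned in the abstract would relax the marginal constraints and is not shown.

```python
import numpy as np


def sinkhorn_plan(scores, epsilon=0.05, n_iters=200):
    """Entropic OT plan between n cells and k prototypes.

    scores: (n, k) array of cell-to-prototype similarities.
    Uniform marginals are an illustrative assumption: every cell
    carries mass 1/n and every prototype receives mass 1/k.
    """
    n, k = scores.shape
    # Gibbs kernel; subtracting the max keeps the exponent non-positive.
    kernel = np.exp((scores - scores.max()) / epsilon)
    row_mass = np.full(n, 1.0 / n)
    col_mass = np.full(k, 1.0 / k)
    u = np.ones(n)
    v = np.ones(k)
    for _ in range(n_iters):
        u = row_mass / (kernel @ v)    # rescale rows toward row_mass
        v = col_mass / (kernel.T @ u)  # rescale columns toward col_mass
    return u[:, None] * kernel * v[None, :]


# Hypothetical usage: 30 cells, 3 prototypes, random similarities.
rng = np.random.default_rng(0)
demo_scores = rng.uniform(-1.0, 1.0, size=(30, 3))
demo_plan = sinkhorn_plan(demo_scores)
# Row-normalising the plan gives a soft cluster assignment per cell.
soft_labels = demo_plan / demo_plan.sum(axis=1, keepdims=True)
hard_labels = soft_labels.argmax(axis=1)
```

The column constraint is what spreads cells across prototypes instead of letting them collapse onto one, which is one common way to encourage global discrimination between clusters.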
Pages: 4967-4974
Number of pages: 8