MVCLST: A spatial transcriptome data analysis pipeline for cell type classification based on multi-view comparative learning

Times cited: 0
Authors
Peng, Wei [1, 2]
Zhang, Zhihao [1]
Dai, Wei [1, 2]
Ping, Zhihao [1]
Fu, Xiaodong [1, 2]
Liu, Li [1, 2]
Liu, Lijun [1, 2]
Yu, Ning [3]
Affiliations
[1] Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Kunming 650050, Peoples R China
[2] Kunming Univ Sci & Technol, Comp Technol Applicat Key Lab Yunnan Prov, Kunming 650050, Peoples R China
[3] State Univ New York, Coll Brockport, Dept Comp Sci, 350 New Campus Dr, Brockport, NY 14422 USA
Funding
National Natural Science Foundation of China
Keywords
Spatial transcriptome data clustering; Cell type identification; Multi-view; Contrastive learning; Consensus clustering; Expression
DOI
10.1016/j.ymeth.2024.11.001
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Recent advances in spatial transcriptomics sequencing technologies not only measure gene expression within individual cells or cell clusters (spots) in a tissue but also pinpoint the location of that expression and produce detailed images of stained tissue sections, offering valuable insights for cell type identification and the exploration of cell function. However, effectively integrating gene expression data, spatial location information, and tissue images remains a significant challenge for computational methods in cell classification. In this work, we propose MVCLST, a multi-view comparative learning method that analyzes spatial transcriptomics data for accurate cell type classification. MVCLST constructs two views from gene expression profiles, cell coordinates, and image features. The proposed multi-view method can significantly enhance the effectiveness of feature extraction while avoiding the impact of erroneous information in the tissue image or gene expression data. The model employs four separate encoders to capture the shared and unique features within each view. To ensure consistency and facilitate information exchange between the two views, MVCLST incorporates a contrastive learning loss function. The shared and private features extracted from both views are fused using the corresponding decoders. Finally, the model applies the Leiden algorithm to cluster the learned features for cell type identification. Additionally, we establish a framework called MVCLST-CCFS for spatial transcriptomics data analysis, based on MVCLST and consistent clustering. Our method achieves excellent clustering results on human dorsolateral prefrontal cortex data and mouse brain tissue data. It also outperforms state-of-the-art techniques in the subsequent search for highly variable genes across cell types on mouse olfactory bulb data.
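The abstract does not give the form of the contrastive loss that keeps the two views consistent. As an illustration only, the sketch below uses a symmetric InfoNCE-style loss, which is one common choice and an assumption here, not necessarily the loss used in MVCLST. The function name `cross_view_contrastive_loss`, the `temperature` value, and the toy embeddings are all hypothetical. Row i of each view is taken to be the embedding of the same spot, so matching rows are positives and all other pairings are negatives.

```python
import math
import random


def normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cross_view_contrastive_loss(view_a, view_b, temperature=0.5):
    """Symmetric InfoNCE-style loss between two views (assumed form, not from the paper).

    view_a, view_b: lists of n embeddings; index i refers to the same spot in both.
    """
    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]
    n = len(a)
    # logits[i][j]: temperature-scaled cosine similarity of spot i (view A) and spot j (view B)
    logits = [
        [sum(x * y for x, y in zip(a[i], b[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def mean_cross_entropy(rows):
        # Cross-entropy where the correct "class" for row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            row_max = max(row)  # subtract the max for numerical stability
            log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
            total += log_sum_exp - row[i]
        return total / len(rows)

    transposed = [list(column) for column in zip(*logits)]
    # Average the A-to-B and B-to-A directions.
    return 0.5 * (mean_cross_entropy(logits) + mean_cross_entropy(transposed))


# Toy data: 8 spots with 16-dimensional embeddings (hypothetical sizes).
random.seed(0)
n_spots, dim = 8, 16
view_a = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_spots)]
view_b_aligned = [[x + 0.01 * random.gauss(0, 1) for x in row] for row in view_a]
view_b_random = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_spots)]

loss_aligned = cross_view_contrastive_loss(view_a, view_b_aligned)
loss_random = cross_view_contrastive_loss(view_a, view_b_random)
print(f"aligned views: {loss_aligned:.3f}, unrelated views: {loss_random:.3f}")
```

When the two views embed each spot consistently, the loss is lower than when the views are unrelated, which is the behaviour a consistency term of this kind is meant to reward.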
Pages: 115-128
Page count: 14