SpaCCC: Large Language Model-Based Cell-Cell Communication Inference for Spatially Resolved Transcriptomic Data

Authors
Ji, Boya [1]
Wang, Xiaoqi [2]
Qiao, Debin [3, 4]
Xu, Liwen [1]
Peng, Shaoliang [1]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Peoples R China
[2] Northwestern Polytech Univ, Sch Comp Sci, Xian 710000, Peoples R China
[3] Zhengzhou Univ, Sch Comp & Artificial Intelligence, Zhengzhou 450001, Peoples R China
[4] Zhengzhou Univ, Natl Supercomp Ctr Zhengzhou, Zhengzhou 450001, Peoples R China
Source
BIG DATA MINING AND ANALYTICS | 2024, Volume 7, Issue 04
Funding
National Natural Science Foundation of China;
Keywords
Accuracy; Large language models; Transcriptomics; Data visualization; Receivers; Spatial databases; Biology; Reliability; Spatial resolution; Signal resolution; Large Language Models (LLM); spatial transcriptome data; Cell-Cell Communications (CCCs); functional gene interaction networks; unified latent space;
DOI
10.26599/BDMA.2024.9020056
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Drawing parallels between linguistic constructs and cellular biology, Large Language Models (LLMs) have achieved success in diverse downstream applications for single-cell data analysis. However, methods that take advantage of LLMs to infer Ligand-Receptor (LR)-mediated cell-cell communications from spatially resolved transcriptomic data are still lacking. Here, we propose SpaCCC to facilitate the inference of spatially resolved cell-cell communications. SpaCCC relies on our fine-tuned single-cell LLM and a functional gene interaction network to embed ligand and receptor genes into a unified latent space. LR pairs that lie significantly closer together in the latent space are taken to be more likely to interact with each other. Molecular diffusion and permutation test strategies are then employed, respectively, to calculate the communication strength and to filter out communications with low specificity. SpaCCC is benchmarked on real single-cell spatial transcriptomic datasets and shows superior performance over other methods. SpaCCC also infers known LR pairs concealed by existing aggregative methods and identifies communication patterns for specific cell types and their signaling pathways. Furthermore, SpaCCC provides various cell-cell communication visualization results at both single-cell and cell-type resolution. In summary, SpaCCC provides a sophisticated and practical tool that allows researchers to decipher spatially resolved cell-cell communications, along with the related communication patterns and signaling pathways, from spatial transcriptome data. SpaCCC is free and publicly available at https://github.com/jiboyalab/SpaCCC.
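The abstract mentions a permutation test for filtering out communications with low specificity but does not spell out the procedure. The following is a minimal sketch of one common way such a test can be set up, not SpaCCC's actual implementation. It assumes a simple score (mean ligand expression in sender cells times mean receptor expression in receiver cells) and a null distribution obtained by shuffling cell-type labels. The function names `lr_score` and `permutation_pvalue`, the scoring formula, and the toy data are all hypothetical and introduced only for illustration.

```python
import numpy as np


def lr_score(ligand, receptor, labels, sender, receiver):
    """Assumed toy score: mean ligand expression in sender cells
    multiplied by mean receptor expression in receiver cells."""
    return ligand[labels == sender].mean() * receptor[labels == receiver].mean()


def permutation_pvalue(ligand, receptor, labels, sender, receiver,
                       n_perm=1000, seed=0):
    """Return the observed score and a one-sided empirical p-value.

    The null distribution is built by shuffling the cell-type labels,
    which breaks any link between cell type and expression.
    """
    rng = np.random.default_rng(seed)
    observed = lr_score(ligand, receptor, labels, sender, receiver)
    null = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = rng.permutation(labels)
        null[i] = lr_score(ligand, receptor, shuffled, sender, receiver)
    # Add-one correction keeps the p-value strictly above zero.
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical toy data: 20 cells of type "A" and 20 cells of type "B".
labels = np.array(["A"] * 20 + ["B"] * 20)
ligand = np.where(labels == "A", 5.0, 0.1)    # ligand is high in type A
receptor = np.where(labels == "B", 4.0, 0.1)  # receptor is high in type B

score, p = permutation_pvalue(ligand, receptor, labels, "A", "B", n_perm=200)
print(f"A -> B score = {score:.2f}, p = {p:.4f}")
```

In this toy setting the A -> B direction should receive a small p-value, while the reverse direction (B -> A) should not, so a significance threshold would retain only the former. How SpaCCC itself defines communication strength and its null model should be checked against the paper and the repository.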
Pages: 1129-1147
Number of pages: 19