scButterfly: a versatile single-cell cross-modality translation method via dual-aligned variational autoencoders

Cited by: 11
Authors
Cao, Yichuan [1, 2]
Zhao, Xiamiao [1, 2]
Tang, Songming [1, 2]
Jiang, Qun [3]
Li, Sijie [1, 2]
Li, Siyu [4]
Chen, Shengquan [1, 2]
Affiliations
[1] Nankai Univ, Sch Math Sci, Tianjin 300071, Peoples R China
[2] Nankai Univ, LPMC, Tianjin 300071, Peoples R China
[3] Tsinghua Univ, Dept Automat, Key Lab Bioinformat & Bioinformat Div, MOE,BNRIST, Beijing 100084, Peoples R China
[4] Nankai Univ, Sch Stat & Data Sci, Tianjin 300071, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
T-CELLS; RNA
DOI
10.1038/s41467-024-47418-x
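Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract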
Recent advancements in simultaneously profiling multi-omics modalities within individual cells have enabled the interrogation of cellular heterogeneity and molecular hierarchy. However, technical limitations lead to highly noisy multi-modal data and substantial costs. Although computational methods have been proposed to translate single-cell data across modalities, broad application of these methods remains impeded by formidable challenges. Here, we propose scButterfly, a versatile single-cell cross-modality translation method based on dual-aligned variational autoencoders and data augmentation schemes. With comprehensive experiments on multiple datasets, we provide compelling evidence of scButterfly's superiority over baseline methods in preserving cellular heterogeneity while translating datasets of various contexts and in revealing cell type-specific biological insights. In addition, we demonstrate the extensive applications of scButterfly for integrative multi-omics analysis of single-modality data, data enhancement of poor-quality single-cell multi-omics, and automatic cell type annotation of scATAC-seq data. Moreover, scButterfly can be generalized to unpaired data training, perturbation-response analysis, and consecutive translation.
Technical limitations of simultaneous multi-omics profiling lead to highly noisy multi-modal data and substantial costs. Here, the authors propose a versatile framework and data augmentation schemes capable of single-cell cross-modality translation and multiple extensive applications.
Pages: 17