Accurate and fast cell marker gene identification with COSG

Cited by: 72
Authors
Dai, Min [2, 4]
Pei, Xiaobing [3]
Wang, Xiu-Jie [1, 2]
Affiliations
[1] Chinese Acad Sci, Innovat Acad Seed Design, Inst Genet & Dev Biol, Beijing 100101, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Huazhong Univ Sci & Technol, Sch Software, Wuhan 430074, Hubei, Peoples R China
[4] Chinese Acad Sci, Inst Genet & Dev Biol, Beijing, Peoples R China
Funding
Beijing Natural Science Foundation
Keywords
cell marker gene; cosine similarity; single-cell RNA-seq; single-cell ATAC-seq; spatially resolved transcriptomics;
DOI
10.1093/bib/bbab579
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Accurate cell classification is the groundwork for downstream analysis of single-cell sequencing data, yet identifying true marker genes for different cell types remains a major challenge. Here, we report COSine similarity-based marker Gene identification (COSG), a cosine similarity-based method for more accurate and scalable marker gene identification. COSG is applicable to single-cell RNA sequencing data, single-cell ATAC sequencing data and spatially resolved transcriptome data, and it is fast and scalable for ultra-large datasets of millions of cells. Application to both simulated and real experimental datasets showed that the marker genes or genomic regions identified by COSG have greater cell-type specificity, demonstrating the superior performance of COSG in both accuracy and efficiency compared with other available methods.
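The abstract names cosine similarity as the core idea but does not give the scoring formula. The following is a minimal sketch of that general idea only, not the COSG implementation: it scores each gene by the cosine similarity between its expression vector across cells and a binary indicator vector for each cell group. The function name `cosine_marker_scores`, the toy matrix, and the group labels are all hypothetical, and the published method may apply further normalization or penalties that are not shown here.

```python
import numpy as np

def cosine_marker_scores(expr, labels):
    """Cosine similarity between each gene and each group's indicator vector.

    expr   : (n_cells, n_genes) non-negative expression matrix (assumed layout)
    labels : length n_cells array of group labels
    Returns (groups, scores), where scores[k, g] is the cosine similarity
    between gene g and the 0/1 membership vector of groups[k].
    """
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)

    # One 0/1 column per group: 1 where the cell belongs to that group.
    indicator = (labels[:, None] == groups[None, :]).astype(float)

    dots = indicator.T @ expr                      # (n_groups, n_genes)
    gene_norm = np.linalg.norm(expr, axis=0)       # per-gene vector length
    group_norm = np.linalg.norm(indicator, axis=0) # per-group vector length
    denom = np.outer(group_norm, gene_norm)

    # Genes with zero expression everywhere get a score of 0.
    scores = np.divide(dots, denom, out=np.zeros_like(dots), where=denom > 0)
    return groups, scores

# Toy data (hypothetical): 4 cells, 3 genes, two groups "A" and "B".
expr = np.array([
    [5.0, 0.0, 2.0],
    [4.0, 0.0, 2.0],
    [0.0, 3.0, 2.0],
    [0.0, 3.0, 2.0],
])
labels = np.array(["A", "A", "B", "B"])

groups, scores = cosine_marker_scores(expr, labels)
top_gene_per_group = scores.argmax(axis=1)
```

In this toy case the first gene is expressed only in group A and the second only in group B, so each receives the highest score for its own group, while the uniformly expressed third gene scores lower for both. A gene expressed only in the target group and at a constant level would reach the maximum score of 1.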
Pages: 12