SCINA: A Semi-Supervised Subtyping Algorithm of Single Cells and Bulk Samples

Cited by: 125
Authors
Zhang, Ze [1]
Luo, Danni [2]
Zhong, Xue [3]
Choi, Jin Huk [3]
Ma, Yuanqing [4, 5]
Wang, Stacy [1]
Mahrt, Elena [3]
Guo, Wei [6]
Stawiski, Eric W. [7, 8]
Modrusan, Zora [7]
Seshagiri, Somasekar [7]
Kapur, Payal [4, 9]
Hon, Gary C. [10]
Brugarolas, James [4, 5]
Wang, Tao [1, 3, 4]
Affiliations
[1] Univ Texas Southwestern Med Ctr Dallas, Quantitat Biomed Res Ctr, Dept Populat & Data Sci, Dallas, TX 75390 USA
[2] Univ Texas Southwestern Med Ctr Dallas, Bioinformat Core Facil, Dallas, TX 75390 USA
[3] Univ Texas Southwestern Med Ctr Dallas, Ctr Genet Host Def, Dallas, TX 75390 USA
[4] Univ Texas Southwestern Med Ctr Dallas, Kidney Canc Program, Simmons Comprehens Canc Ctr, Dallas, TX 75390 USA
[5] Univ Texas Southwestern Med Ctr Dallas, Dept Internal Med, Dallas, TX 75390 USA
[6] Univ Texas Southwestern Med Ctr Dallas, BioHPC, Dallas, TX 75390 USA
[7] Genentech Inc, Mol Biol Dept, San Francisco, CA 94080 USA
[8] Genentech Inc, Bioinformat & Computat Biol Dept, San Francisco, CA 94080 USA
[9] Univ Texas Southwestern Med Ctr Dallas, Dept Pathol, Dallas, TX 75390 USA
[10] Univ Texas Southwestern Med Ctr Dallas, Lab Regulatory Genom, Cecil H & Ida Green Ctr Reprod Biol Sci, Div Basic Reprod Biol Res, Dept Obstet & Gynecol, Dallas, TX 75390 USA
Source
GENES | 2019 / Vol. 10 / Issue 07
Funding
National Institutes of Health (NIH), USA;
Keywords
single-cell RNA-seq; CyTOF; SCINA; HLRCC; RCC; renal cell carcinoma; fumarase; fumarate hydratase; COMPREHENSIVE MOLECULAR CHARACTERIZATION; GENOME-WIDE ASSOCIATION; NEXT-GENERATION; POINT MUTATIONS; CANCER; IDENTIFICATION; DISCOVERY; FRAMEWORK;
DOI
10.3390/genes10070531
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007; 090102;
Abstract
Advances in single-cell RNA sequencing (scRNA-Seq) have allowed for comprehensive analyses of single cell data. However, current analyses of scRNA-Seq data usually start from unsupervised clustering or visualization. These methods ignore prior knowledge of transcriptomes and the probable structures of the data. Moreover, cell identification heavily relies on subjective and possibly inaccurate human inspection afterwards. To address these analytical challenges, we developed SCINA (Semi-supervised Category Identification and Assignment), a semi-supervised model that exploits previously established gene signatures using an expectation-maximization (EM) algorithm. SCINA is applicable to scRNA-Seq and flow cytometry/CyTOF data, as well as other data of similar format. We applied SCINA to a wide range of datasets, and showed its accuracy, stability and efficiency, which exceeded most popular unsupervised approaches. SCINA discovered an intermediate stage of oligodendrocytes from mouse brain scRNA-Seq data. SCINA also detected immune cell population changes in cytometry data in a genetically-engineered mouse model. Furthermore, SCINA performed well with bulk gene expression data. Specifically, we identified a new kidney tumor clade with similarity to FH-deficient tumors (FHD), which we refer to as FHD-like tumors (FHDL). Overall, SCINA provides both methodological advances and biological insights from perspectives different from traditional analytical methods.
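The abstract describes assigning cells to known types with an EM algorithm seeded by established gene signatures. The following is a minimal toy sketch of that general idea, not the SCINA model itself: it assumes a diagonal Gaussian mixture with one component per cell type, a shared per-gene variance, and an initialization in which each type's signature genes start at a high mean. It also omits any "unknown" category. The function name `signature_em`, its parameters, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def signature_em(expr, signatures, n_iter=50):
    """Toy signature-seeded EM (illustrative assumption, not SCINA's model).

    expr: genes x cells array; signatures: one list of gene row indices per type.
    Returns hard labels (cells,) and soft responsibilities (cells x types).
    """
    n_genes, n_cells = expr.shape
    k = len(signatures)

    # Seed the means: low everywhere, high for each type's own signature genes.
    lo, hi = np.quantile(expr, [0.25, 0.75], axis=1)
    means = np.tile(lo[:, None], (1, k))
    for t, genes in enumerate(signatures):
        means[genes, t] = hi[genes]

    var = expr.var(axis=1) + 1e-6   # shared per-gene variance
    pi = np.full(k, 1.0 / k)        # mixing proportions

    for _ in range(n_iter):
        # E-step: responsibilities from Gaussian log-likelihoods.
        diff = expr[:, :, None] - means[:, None, :]          # genes x cells x types
        log_p = np.log(pi)[None, :] - 0.5 * np.sum(
            diff ** 2 / var[:, None, None], axis=0
        )
        log_p -= log_p.max(axis=1, keepdims=True)            # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: update proportions, means, and shared variances.
        nk = resp.sum(axis=0) + 1e-12
        pi = nk / nk.sum()
        means = (expr @ resp) / nk
        diff = expr[:, :, None] - means[:, None, :]
        var = np.einsum("gck,ck->g", diff ** 2, resp) / n_cells + 1e-6

    return resp.argmax(axis=1), resp


# Synthetic example: 2 cell types, 4 genes, 100 cells per type.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 100)
expr = rng.normal(0.0, 1.0, size=(4, 200))
expr[0:2, truth == 0] += 5.0   # genes 0-1 mark type 0
expr[2:4, truth == 1] += 5.0   # genes 2-3 mark type 1

labels, resp = signature_em(expr, [[0, 1], [2, 3]])
accuracy = (labels == truth).mean()
```

Because the signatures fix which component corresponds to which cell type, the output labels are directly interpretable, unlike cluster indices from an unsupervised method that must be annotated afterwards.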
Pages: 17