Clustering of Small-Sample Single-Cell RNA-Seq Data via Feature Clustering and Selection

Cited by: 2
Authors
Vans, Edwin [1, 2]
Sharma, Alok [1, 5, 6, 7]
Patil, Ashwini [8]
Shigemizu, Daichi [3, 4, 5, 6]
Tsunoda, Tatsuhiko [4, 5, 6]
Affiliations
[1] Univ South Pacific, Sch Engn & Phys, Suva, Fiji
[2] Fiji Natl Univ, Sch Elect & Elect Engn, Suva, Fiji
[3] Natl Ctr Geriatr & Gerontol, Med Genome Ctr, Obu, Aichi 4748511, Japan
[4] Tokyo Med & Dent Univ TMDU, Med Res Inst, Dept Med Sci Math, Tokyo 1138510, Japan
[5] RIKEN, Ctr Integrat Med Sci, Yokohama, Kanagawa 2300045, Japan
[6] JST, CREST, Tokyo 1138510, Japan
[7] Griffith Univ, Inst Integrated & Intelligent Syst, Brisbane, Qld 4111, Australia
[8] Univ Tokyo, Inst Med Sci, Minato Ku, 4-6-1 Shirokanedai, Tokyo 1088639, Japan
Source
PRICAI 2019: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III | 2019 / Vol. 11672
Keywords
Single-cell RNA-Seq; Hierarchical clustering; Feature selection; HETEROGENEITY; EMBRYOS; FATE;
DOI
10.1007/978-3-030-29894-4_36
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present FeatClust, a software tool for clustering small-sample-size single-cell RNA-Seq datasets. The FeatClust approach is based on feature selection. It divides the features into several groups by agglomerative hierarchical clustering, then iteratively clusters the samples and removes the features belonging to the groups with the least variance across samples. The optimal number of feature groups is selected by silhouette analysis of the clustered data, i.e., by choosing the clustering with the highest average silhouette coefficient. If the number of clusters is not known, FeatClust also allows one to choose it visually by generating a silhouette plot for a chosen number of groupings of the dataset. We cluster five small-sample single-cell RNA-Seq datasets and use the adjusted Rand index to compare the results with those of other clustering packages. The results are promising and show the effectiveness of FeatClust on small-sample-size datasets.
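The two scores named in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the FeatClust implementation: the function names, the toy 2-D points, and the two candidate labelings (standing in for clusterings obtained with different numbers of retained feature groups) are all assumptions made for illustration. It shows how a candidate clustering might be selected by the highest average silhouette coefficient and then compared against known labels with the adjusted Rand index.

```python
from collections import Counter
from math import comb, dist


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same samples."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def mean_silhouette(points, labels):
    """Average silhouette coefficient using Euclidean distance."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical toy data: six samples in two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
truth = [0, 0, 0, 1, 1, 1]

# Hypothetical candidate clusterings, e.g. from different feature subsets.
candidates = {
    "subset_a": [0, 0, 0, 1, 1, 1],
    "subset_b": [0, 1, 0, 1, 0, 1],
}

# Keep the candidate with the highest average silhouette coefficient.
best = max(candidates, key=lambda k: mean_silhouette(points, candidates[k]))
print(best, adjusted_rand_index(truth, candidates[best]))
```

The silhouette score needs no ground-truth labels, which is why it can drive the selection step, whereas the adjusted Rand index requires reference labels and is therefore suited only to evaluation.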
Pages: 445-456
Page count: 12
Related papers
27 in total
  • [1] [Anonymous], 2016, NUCL ACIDS RES, DOI 10.1093/NAR/GKW430
  • [2] [Anonymous], 2018, SEURAT R TOOLKIT SIN
  • [3] [Anonymous], 2007, 18 ANN ACM SIAM S DI
  • [4] Cell fate inclination within 2-cell and 4-cell mouse embryos revealed by single-cell RNA sequencing
    Biase, Fernando H.
    Cao, Xiaoyi
    Zhong, Sheng
    [J]. GENOME RESEARCH, 2014, 24 (11) : 1787 - 1796
  • [5] Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells
    Buettner, Florian
    Natarajan, Kedar N.
    Casale, F. Paolo
    Proserpio, Valentina
    Scialdone, Antonio
    Theis, Fabian J.
    Teichmann, Sarah A.
    Marioni, John C.
    Stegle, Oliver
    [J]. NATURE BIOTECHNOLOGY, 2015, 33 (02) : 155 - 160
  • [6] Integrating single-cell transcriptomic data across different conditions, technologies, and species
    Butler, Andrew
    Hoffman, Paul
    Smibert, Peter
    Papalexi, Efthymia
    Satija, Rahul
    [J]. NATURE BIOTECHNOLOGY, 2018, 36 (05) : 411 - +
  • [7] Single-cell RNA-seq transcriptome analysis of linear and circular RNAs in mouse preimplantation embryos
    Fan, Xiaoying
    Zhang, Xiannian
    Wu, Xinglong
    Guo, Hongshan
    Hu, Yuqiong
    Tang, Fuchou
    Huang, Yanyi
    [J]. GENOME BIOLOGY, 2015, 16
  • [8] Heterogeneity in Oct4 and Sox2 Targets Biases Cell Fate in 4-Cell Mouse Embryos
    Goolam, Mubeen
    Scialdone, Antonio
    Graham, Sarah J. L.
    Macaulay, Iain C.
    Jedrusik, Agnieszka
    Hupalowska, Anna
    Voet, Thierry
    Marioni, John C.
    Zernicka-Goetz, Magdalena
    [J]. CELL, 2016, 165 (01) : 61 - 74
  • [9] SINCERA: A Pipeline for Single-Cell RNA-Seq Profiling Analysis
    Guo, Minzhe
    Wang, Hui
    Potter, S. Steven
    Whitsett, Jeffrey A.
    Xu, Yan
    [J]. PLOS COMPUTATIONAL BIOLOGY, 2015, 11 (11)
  • [10] Hebenstreit Daniel, 2012, Biology (Basel), V1, P658, DOI 10.3390/biology1030658