A HIERARCHICAL BAYESIAN MODEL FOR SINGLE-CELL CLUSTERING USING RNA-SEQUENCING DATA

Cited by: 0
Authors
Liu, Yiyi [1]
Warren, Joshua L. [1]
Zhao, Hongyu [1]
Affiliations
[1] Yale Univ, Dept Biostat, Sch Publ Hlth, New Haven, CT 06520 USA
Keywords
Bayesian hierarchical model; clustering; Dirichlet process; Gaussian mixture model; missing data; single-cell RNA-sequencing; TRANSCRIPTOMES; HETEROGENEITY; VISUALIZATION; CHALLENGES
DOI
10.1214/19-AOAS1250
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Understanding the heterogeneity of cells is an important biological question. The development of single-cell RNA-sequencing (scRNA-seq) technology provides high-resolution data for such inquiry. A key challenge in scRNA-seq analysis is the high variability of measured RNA expression levels and frequent dropouts (missing values) due to limited input RNA compared to bulk RNA-seq measurement. Existing clustering methods do not perform well for these noisy and zero-inflated scRNA-seq data. In this manuscript, we propose a Bayesian hierarchical model, called BasClu, to appropriately characterize important features of scRNA-seq data in order to more accurately cluster cells. We demonstrate the effectiveness of our method with extensive simulation studies and applications to three real scRNA-seq datasets.
Pages: 1733-1752
Number of pages: 20