scDC: single cell differential composition analysis

Cited by: 25
Authors
Cao, Yue [1]
Lin, Yingxin [1]
Ormerod, John T. [1]
Yang, Pengyi [1, 2, 3]
Yang, Jean Y. H. [1, 2]
Lo, Kitty K. [1]
Affiliations
[1] Univ Sydney, Sch Math & Stat, Sydney, NSW 2006, Australia
[2] Univ Sydney, Charles Perkins Ctr, Sydney, NSW 2006, Australia
[3] Univ Sydney, Fac Med & Hlth, Childrens Med Res Inst, Sydney, NSW 2145, Australia
Funding
National Health and Medical Research Council of Australia; Australian Research Council
Keywords
Single cell; RNA-seq; scRNA-seq; Composition analysis; Simultaneous confidence intervals
DOI
10.1186/s12859-019-3211-9
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Differences in cell-type composition across subjects and conditions often carry biological significance. Recent advances in single cell sequencing technologies enable cell types to be identified at the single cell level, and as a result, the cell-type composition of tissues can now be studied in exquisite detail. However, a number of challenges remain in cell-type composition analysis: no existing method identifies cell types perfectly, and variability related to cell sampling exists in any single cell experiment. This necessitates the development of methods for estimating uncertainty in cell-type composition.
Results: We developed a novel single cell differential composition (scDC) analysis method that performs differential cell-type composition analysis via bootstrap resampling. scDC captures the uncertainty associated with the cell-type proportions of each subject via bias-corrected and accelerated bootstrap confidence intervals. We assessed the performance of our method using a number of simulated datasets and synthetic datasets curated from publicly available single cell datasets. In simulated datasets, scDC correctly recovered the true cell-type proportions. In synthetic datasets, the cell-type compositions returned by scDC were highly concordant with reference cell-type compositions from the original data. Since the majority of datasets tested in this study have only 2 to 5 subjects per condition, the addition of confidence intervals enabled better comparisons of compositional differences between subjects and across conditions.
Conclusions: scDC is a novel statistical method for performing differential cell-type composition analysis for scRNA-seq data. It uses bootstrap resampling to estimate the standard errors associated with cell-type proportion estimates and performs significance testing through generalized linear models (GLMs) and generalized linear mixed models (GLMMs). We have made this method available to the scientific community as part of the scdney (Single Cell Data Integrative Analysis) R package, available from https://github.com/SydneyBioX/scdney.
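The bias-corrected and accelerated (BCa) bootstrap interval mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the scdney implementation and makes simplifying assumptions: it treats the cell-type labels of a single subject as fixed, resamples cells with replacement, and skips any re-clustering step. The function name `bca_proportion_ci` and its parameters are hypothetical and chosen only for illustration.

```python
import random
from statistics import NormalDist


def bca_proportion_ci(labels, cell_type, n_boot=2000, alpha=0.05, seed=0):
    """BCa bootstrap interval for the proportion of `cell_type` in `labels`.

    Hypothetical helper for illustration; not part of scdney.
    Returns (point_estimate, lower_bound, upper_bound).
    """
    n = len(labels)
    if n < 2:
        raise ValueError("need at least two cells")
    rng = random.Random(seed)
    norm = NormalDist()

    # Indicator of membership in the cell type of interest.
    hits = [1 if lab == cell_type else 0 for lab in labels]
    count = sum(hits)
    theta_hat = count / n

    # Bootstrap replicates: resample cells with replacement.
    boot = sorted(sum(rng.choices(hits, k=n)) / n for _ in range(n_boot))

    # Bias correction z0: share of replicates below the observed estimate,
    # clamped away from 0 and 1 so the normal quantile is defined.
    below = sum(1 for b in boot if b < theta_hat)
    frac = min(max(below / n_boot, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    z0 = norm.inv_cdf(frac)

    # Acceleration from leave-one-cell-out (jackknife) estimates.
    jack = [(count - h) / (n - 1) for h in hits]
    jack_mean = sum(jack) / n
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def adjusted_quantile(q):
        # Map the nominal level q to its BCa-adjusted level, then read the
        # corresponding order statistic of the bootstrap replicates.
        z = norm.inv_cdf(q)
        adj = norm.cdf(z0 + (z0 + z) / (1 - accel * (z0 + z)))
        idx = min(max(int(adj * n_boot), 0), n_boot - 1)
        return boot[idx]

    return theta_hat, adjusted_quantile(alpha / 2), adjusted_quantile(1 - alpha / 2)


# Toy labels for one subject (made-up data, for illustration only).
labels = ["alpha"] * 60 + ["beta"] * 30 + ["delta"] * 10
estimate, lower, upper = bca_proportion_ci(labels, "beta")
print(f"beta proportion: {estimate:.2f}, 95% BCa CI: [{lower:.2f}, {upper:.2f}]")
```

Compared with a plain percentile interval, the BCa adjustment shifts the quantile levels to account for median bias (z0) and skewness (the acceleration term), which matters most for rare cell types whose proportions sit near 0.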
Pages: 12