Mapping cell populations in flow cytometry data for cross-sample comparison using the Friedman-Rafsky test statistic as a distance measure

Cited by: 14
Authors
Hsiao, Chiaowen [1,2]
Liu, Mengya [3]
Stanton, Rick [4]
McGee, Monnie [3]
Qian, Yu [4]
Scheuermann, Richard H. [4,5]
Affiliations
[1] Univ Maryland, Ctr Bioinformat & Computat Biol, College Pk, MD 20742 USA
[2] Univ Maryland, Appl Math Appl Stat & Sci Comp, College Pk, MD 20742 USA
[3] So Methodist Univ, Dept Stat Sci, Dallas, TX 75275 USA
[4] J Craig Venter Inst, Dept Informat, 4120 Capricorn Lane, La Jolla, CA 92037 USA
[5] Univ Calif San Diego, Dept Pathol, San Diego, CA 92103 USA
Keywords
flow cytometry; cross-sample comparison; cell population matching; Friedman-Rafsky test; minimum spanning tree; single-cell analysis; RUSH IMMUNOTHERAPY; WALD-WOLFOWITZ; OMALIZUMAB
DOI
10.1002/cyto.a.22735
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Flow cytometry (FCM) is a fluorescence-based single-cell experimental technology that is routinely applied in biomedical research for identifying cellular biomarkers of normal physiological responses and abnormal disease states. While many computational methods have been developed that focus on identifying cell populations in individual FCM samples, very few have addressed how the identified cell populations can be matched across samples for comparative analysis. This article presents FlowMap-FR, a novel method for cell population mapping across FCM samples. FlowMap-FR is based on the Friedman-Rafsky nonparametric test statistic (FR statistic), which quantifies the equivalence of multivariate distributions. As applied to FCM data by FlowMap-FR, the FR statistic objectively quantifies the similarity between cell populations based on the shapes, sizes, and positions of fluorescence data distributions in the multidimensional feature space. To test and evaluate the performance of FlowMap-FR, we simulated the kinds of biological and technical sample variations that are commonly observed in FCM data. The results show that FlowMap-FR is able to effectively identify equivalent cell populations between samples under scenarios of proportion differences and modest position shifts. As a statistical test, FlowMap-FR can be used to determine whether the expression of a cellular marker is statistically different between two cell populations, suggesting candidates for new cellular phenotypes by providing an objective statistical measure. In addition, FlowMap-FR can indicate situations in which inappropriate splitting or merging of cell populations has occurred during gating procedures. Using both simulated and real data, we compared the FR statistic with the symmetric version of the Kullback-Leibler (KL) divergence used in a previous population matching method. The FR statistic outperforms the symmetric KL divergence in distinguishing equivalent from nonequivalent cell populations. FlowMap-FR was also employed as a distance metric to match cell populations delineated by manual gating across 30 FCM samples from a benchmark FlowCAP data set. An F-measure of 0.88 was obtained, indicating high precision and recall of the FR-based population matching results. FlowMap-FR has been implemented as a standalone R/Bioconductor package so that it can be easily incorporated into current FCM data analytical workflows. © 2015 The Authors. Published by Wiley Periodicals, Inc. on behalf of ISAC.
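To make the idea behind the FR statistic concrete, the following is a minimal Python sketch of the standardized Friedman-Rafsky statistic computed on a Euclidean minimum spanning tree. It is an illustration only and not the FlowMap-FR implementation, which is distributed as an R/Bioconductor package. The function name `fr_statistic`, the pure-Python Prim's algorithm, and the synthetic Gaussian "populations" are all assumptions made for this example; the mean and conditional variance follow the formulas given by Friedman and Rafsky (1979).

```python
import math
import random


def fr_statistic(x, y):
    """Standardized Friedman-Rafsky statistic W for two point samples.

    x, y: lists of coordinate tuples of equal dimension.
    W near 0 is consistent with both samples sharing one distribution;
    strongly negative W indicates the samples occupy different regions.
    Hypothetical helper for illustration, not part of any package.
    """
    m, n = len(x), len(y)
    total = m + n
    if m == 0 or n == 0 or total < 4:
        raise ValueError("need two non-empty samples with at least 4 points in total")
    pts = list(x) + list(y)
    labels = [0] * m + [1] * n

    # Prim's algorithm on the complete Euclidean graph of the pooled sample.
    in_tree = [False] * total
    best = [math.inf] * total      # cheapest known link from each point to the tree
    parent = [-1] * total
    degree = [0] * total
    best[0] = 0.0
    cross_edges = 0                # MST edges joining points from different samples
    for _ in range(total):
        u = min((i for i in range(total) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            degree[u] += 1
            degree[parent[u]] += 1
            if labels[u] != labels[parent[u]]:
                cross_edges += 1
        for v in range(total):
            if not in_tree[v]:
                d = math.dist(pts[u], pts[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u

    # Number of runs: subtrees left after deleting all cross-sample edges.
    runs = cross_edges + 1
    # C: number of MST edge pairs that share a common node.
    c = sum(d * (d - 1) // 2 for d in degree)

    mean = 2.0 * m * n / total + 1.0
    var = (2.0 * m * n / (total * (total - 1))) * (
        (2.0 * m * n - total) / total
        + (c - total + 2) / ((total - 2) * (total - 3))
        * (total * (total - 1) - 4.0 * m * n + 2)
    )
    return (runs - mean) / math.sqrt(var)


# Synthetic two-marker "populations" (assumed data, not from the paper).
rng = random.Random(0)


def gaussian_cloud(center, size=50):
    return [(rng.gauss(center, 1.0), rng.gauss(center, 1.0)) for _ in range(size)]


w_same = fr_statistic(gaussian_cloud(0.0), gaussian_cloud(0.0))
w_shifted = fr_statistic(gaussian_cloud(0.0), gaussian_cloud(10.0))
print(f"same distribution: W = {w_same:.2f}; shifted population: W = {w_shifted:.2f}")
```

In this sketch, two samples drawn from the same distribution interleave in the tree and give a W close to zero, whereas a large position shift leaves almost no cross-sample edges and drives W strongly negative. The O(N^2) tree construction is chosen for readability; real FCM populations with many thousands of events would likely need downsampling or a faster spanning-tree routine.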
Pages: 71-88
Number of pages: 18