Hierarchical Clustering Can Identify B Cell Clones with High Confidence in Ig Repertoire Sequencing Data

Cited by: 96
Authors
Gupta, Namita T. [1]
Adams, Kristofor D. [2]
Briggs, Adrian W. [2]
Timberlake, Sonia C. [2]
Vigneault, Francois [2]
Kleinstein, Steven H. [1, 3, 4]
Affiliations
[1] Yale Univ, Interdept Program Computat Biol & Bioinformat, New Haven, CT 06520 USA
[2] AbVitro, Boston, MA 02210 USA
[3] Yale Sch Med, Dept Immunol, New Haven, CT 06520 USA
[4] Yale Sch Med, Dept Pathol, New Haven, CT 06520 USA
Funding
National Institutes of Health (USA);
Keywords
HIV-1-NEUTRALIZING ANTIBODIES; IMMUNOGLOBULIN; GENERATION; IDENTIFICATION; MATURATION; DIVERSITY; INFECTION; RITUXIMAB; SELECTION; TOOLKIT;
DOI
10.4049/jimmunol.1601850
Chinese Library Classification number
R392 [Medical Immunology]; Q939.91 [Immunology];
Discipline classification code
100102;
Abstract
Adaptive immunity is driven by the expansion, somatic hypermutation, and selection of B cell clones. Each clone is the progeny of a single B cell responding to Ag, with diversified Ig receptors. These receptors can now be profiled on a large scale by next-generation sequencing. Such data provide a window into the microevolutionary dynamics that drive successful immune responses and the dysregulation that occurs with aging or disease. Clonal relationships are not directly measured, but they must be computationally inferred from these sequencing data. Although several hierarchical clustering-based methods have been proposed, they vary in distance and linkage methods and have not yet been rigorously compared. In this study, we use a combination of human experimental and simulated data to characterize the performance of hierarchical clustering-based methods for partitioning sequences into clones. We find that single linkage clustering has high performance, with specificity, sensitivity, and positive predictive value all > 99%, whereas other linkages result in a significant loss of sensitivity. Surprisingly, distance metrics that incorporate the biases of somatic hypermutation do not outperform simple Hamming distance. Although errors were more likely in sequences with short junctions, using the entire dataset to choose a single distance threshold for clustering is near optimal. Our results suggest that hierarchical clustering using single linkage with Hamming distance identifies clones with high confidence and provides a fully automated method for clonal grouping. The performance estimates we develop provide important context to interpret clonal analysis of repertoire sequencing data and allow for rigorous testing of other clonal grouping algorithms.
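The clustering idea summarized in the abstract (single linkage with Hamming distance, cut at one distance threshold) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the input sequences have already been restricted to junctions of equal length, it normalizes Hamming distance by sequence length, and the function names, the example sequences, and the threshold of 0.1 are all hypothetical choices made for illustration. It relies on the fact that cutting a single-linkage hierarchy at a fixed height gives the connected components of the graph that links every pair of sequences within that distance.

```python
from __future__ import annotations

from itertools import combinations


def hamming_fraction(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def single_linkage_clones(seqs: list[str], threshold: float) -> list[list[str]]:
    """Group sequences as a single-linkage tree cut at `threshold` would.

    Two sequences end up in the same group when a chain of pairwise
    distances <= threshold connects them, so union-find is sufficient.
    """
    parent = list(range(len(seqs)))

    def find(i: int) -> int:
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if hamming_fraction(seqs[i], seqs[j]) <= threshold:
            parent[find(i)] = find(j)

    clones: dict[int, list[str]] = {}
    for i, seq in enumerate(seqs):
        clones.setdefault(find(i), []).append(seq)
    return list(clones.values())


# Hypothetical 12-nt junctions; the threshold is illustrative only.
junctions = ["TGTGCGAGAGAT", "TGTGCGAGAGAC", "TGTGCAAGAGAC", "TGTACCTTTGGG"]
clones = single_linkage_clones(junctions, threshold=0.1)
for clone in clones:
    print(clone)
```

In this toy input the first and third sequences differ at two of twelve positions, which exceeds the threshold, yet they fall into the same group because each is within the threshold of the second sequence. That chaining behavior is what distinguishes single linkage from complete or average linkage.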
Pages: 2489-2499
Number of pages: 11