Hierarchical Clustering Can Identify B Cell Clones with High Confidence in Ig Repertoire Sequencing Data

Cited by: 92
Authors
Gupta, Namita T. [1]
Adams, Kristofor D. [2]
Briggs, Adrian W. [2]
Timberlake, Sonia C. [2]
Vigneault, Francois [2]
Kleinstein, Steven H. [1, 3, 4]
Affiliations
[1] Yale Univ, Interdept Program Computat Biol & Bioinformat, New Haven, CT 06520 USA
[2] AbVitro, Boston, MA 02210 USA
[3] Yale Sch Med, Dept Immunol, New Haven, CT 06520 USA
[4] Yale Sch Med, Dept Pathol, New Haven, CT 06520 USA
Funding
National Institutes of Health (USA);
Keywords
HIV-1-NEUTRALIZING ANTIBODIES; IMMUNOGLOBULIN; GENERATION; IDENTIFICATION; MATURATION; DIVERSITY; INFECTION; RITUXIMAB; SELECTION; TOOLKIT;
DOI
10.4049/jimmunol.1601850
Chinese Library Classification (CLC) number
R392 [Medical Immunology]; Q939.91 [Immunology];
Discipline classification code
100102;
Abstract
Adaptive immunity is driven by the expansion, somatic hypermutation, and selection of B cell clones. Each clone is the progeny of a single B cell responding to Ag, with diversified Ig receptors. These receptors can now be profiled on a large scale by next-generation sequencing. Such data provide a window into the microevolutionary dynamics that drive successful immune responses and the dysregulation that occurs with aging or disease. Clonal relationships are not directly measured, but they must be computationally inferred from these sequencing data. Although several hierarchical clustering-based methods have been proposed, they vary in distance and linkage methods and have not yet been rigorously compared. In this study, we use a combination of human experimental and simulated data to characterize the performance of hierarchical clustering-based methods for partitioning sequences into clones. We find that single linkage clustering has high performance, with specificity, sensitivity, and positive predictive value all >99%, whereas other linkages result in a significant loss of sensitivity. Surprisingly, distance metrics that incorporate the biases of somatic hypermutation do not outperform simple Hamming distance. Although errors were more likely in sequences with short junctions, using the entire dataset to choose a single distance threshold for clustering is near optimal. Our results suggest that hierarchical clustering using single linkage with Hamming distance identifies clones with high confidence and provides a fully automated method for clonal grouping. The performance estimates we develop provide important context to interpret clonal analysis of repertoire sequencing data and allow for rigorous testing of other clonal grouping algorithms.
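As a rough illustration of the approach the abstract describes, the sketch below groups junction sequences by single linkage over a length-normalized Hamming distance. It is not the authors' implementation: the function names, the example sequences, and the threshold of 0.15 are hypothetical, and the step of comparing only sequences of equal junction length is an assumption made because Hamming distance is defined only for equal-length strings. Cutting a single linkage dendrogram at a fixed threshold is equivalent to taking connected components of the graph that links sequences within that distance, so a union-find structure is used in place of a full dendrogram.

```python
from collections import defaultdict
from itertools import combinations


def hamming_fraction(a, b):
    """Fraction of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def single_linkage_clones(junctions, threshold=0.15):
    """Return one clone label per junction sequence.

    `threshold` is a hypothetical cutoff on the normalized Hamming distance;
    sequences joined by a chain of links at or below it share a label.
    """
    parent = list(range(len(junctions)))

    def find(i):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Assumption: only junctions of the same length are compared.
    by_length = defaultdict(list)
    for idx, seq in enumerate(junctions):
        by_length[len(seq)].append(idx)

    for indices in by_length.values():
        for i, j in combinations(indices, 2):
            if hamming_fraction(junctions[i], junctions[j]) <= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    # Relabel roots as consecutive clone ids in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(junctions))]


example = [
    "TGTGCGAGAGAT",  # 1 mismatch from the next sequence
    "TGTGCGAGAGAC",  # 1 mismatch from the next sequence
    "TGTGCGAGTGAC",  # 2 mismatches from the first, joined through the chain
    "TGTACCTTTGGG",  # distant sequence of the same length
    "TGTGCG",        # different junction length, never compared
]
print(single_linkage_clones(example))  # [0, 0, 0, 1, 2]
```

The first and third sequences differ at 2 of 12 positions (about 0.17), which exceeds the illustrative threshold, yet they land in the same clone because each is within the threshold of the second sequence. This chaining behavior is what distinguishes single linkage from complete or average linkage. The pairwise loop is quadratic in group size, so this sketch is only suited to small inputs.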
Pages: 2489-2499
Number of pages: 11