Statistical inference of the generation probability of T-cell receptors from sequence repertoires

Cited by: 203
Authors
Murugan, Anand [1]
Mora, Thierry [2, 3]
Walczak, Aleksandra M. [3, 4]
Callan, Curtis G., Jr. [1, 5]
Affiliations
[1] Princeton Univ, Joseph Henry Labs, Princeton, NJ 08544 USA
[2] CNRS, Lab Phys Stat, UMR8550, F-75005 Paris, France
[3] Ecole Normale Super, F-75005 Paris, France
[4] CNRS, Phys Theor Lab, UMR8549, F-75005 Paris, France
[5] Simons Ctr Syst Biol, Inst Adv Study, Princeton, NJ 08544 USA
Funding
US National Science Foundation
Keywords
convergent recombination; expectation maximization; palindromic nucleotides; insertion/deletion profiles; V(D)J recombination; break repair; diversity
DOI
10.1073/pnas.1212755109
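Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract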
Stochastic rearrangement of germline V-, D-, and J-genes to create variable coding sequence for certain cell surface receptors is at the origin of immune system diversity. This process, known as "VDJ recombination", is implemented via a series of stochastic molecular events involving gene choices and random nucleotide insertions between, and deletions from, genes. We use large sequence repertoires of the variable CDR3 region of human CD4+ T-cell receptor beta chains to infer the statistical properties of these basic biochemical events. Because any given CDR3 sequence can be produced in multiple ways, the probability distribution of hidden recombination events cannot be inferred directly from the observed sequences; we therefore develop a maximum likelihood inference method to achieve this end. To separate the properties of the molecular rearrangement mechanism from the effects of selection, we focus on nonproductive CDR3 sequences in T-cell DNA. We infer the joint distribution of the various generative events that occur when a new T-cell receptor gene is created. We find a rich picture of correlation (and absence thereof), providing insight into the molecular mechanisms involved. The generative event statistics are consistent between individuals, suggesting a universal biochemical process. Our probabilistic model predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing us to quantify the potential diversity of the T-cell repertoire and to understand why some sequences are shared between individuals. We argue that the use of formal statistical inference methods, of the kind presented in this paper, will be essential for quantitative understanding of the generation and evolution of diversity in the adaptive immune system.
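The abstract's central idea, that one observed sequence can arise from several hidden recombination events so that its generation probability is a sum over those events, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' model: the two toy "genes", the deletion and insertion limits, the uniform insertion composition, and the independence of gene choice, deletions, and insertions (the paper infers a joint distribution). The update rule is a generic expectation-maximization step, in the spirit of the "expectation maximization" keyword.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy germline segments (NOT real TCR genes).
GENES = {"V1": "ACGT", "V2": "ACGA"}
MAX_DEL = 2  # assumed maximum number of 3' deletions
MAX_INS = 2  # assumed maximum number of inserted nucleotides


def scenarios(seq):
    """List hidden events (gene, n_del, n_ins) that can produce seq."""
    out = []
    for gene, d, n in product(GENES, range(MAX_DEL + 1), range(MAX_INS + 1)):
        germ = GENES[gene]
        kept = germ[: len(germ) - d]  # germline left after d deletions
        if len(seq) == len(kept) + n and seq.startswith(kept):
            out.append((gene, d, n))
    return out


def scenario_prob(scenario, params):
    """Probability of one hidden event; inserted bases assumed uniform."""
    gene, d, n = scenario
    return params["gene"][gene] * params["del"][d] * params["ins"][n] * 0.25 ** n


def generation_probability(seq, params):
    """Sum the probabilities of every scenario compatible with seq."""
    return sum(scenario_prob(s, params) for s in scenarios(seq))


def em_step(seqs, params):
    """One EM iteration: posterior-weighted event counts, then renormalize."""
    counts = {name: defaultdict(float) for name in params}
    for seq in seqs:
        scen = scenarios(seq)
        weights = [scenario_prob(s, params) for s in scen]
        total = sum(weights)
        if total == 0:
            continue  # sequence cannot be explained by the toy model
        for (gene, d, n), w in zip(scen, weights):
            post = w / total  # posterior probability of this scenario
            counts["gene"][gene] += post
            counts["del"][d] += post
            counts["ins"][n] += post
    new_params = {}
    for name in params:
        z = sum(counts[name].values())
        new_params[name] = {k: counts[name][k] / z for k in params[name]}
    return new_params


def uniform_params():
    """Uninformative starting point for the iteration."""
    return {
        "gene": {g: 1 / len(GENES) for g in GENES},
        "del": {d: 1 / (MAX_DEL + 1) for d in range(MAX_DEL + 1)},
        "ins": {n: 1 / (MAX_INS + 1) for n in range(MAX_INS + 1)},
    }
```

In this toy setting the sequence "ACGT" is compatible with five different scenarios (for example, "V1" untouched, or "V2" with one deletion and one inserted "T"), which is why the hidden event distribution cannot be read off the sequences directly. Repeatedly calling `em_step` on a list of observed sequences, starting from `uniform_params()`, gives parameter estimates under these assumptions.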
Pages: 16161-16166
Number of pages: 6