OLGA: fast computation of generation probabilities of B- and T-cell receptor amino acid sequences and motifs
Cited by: 121
Authors:
Sethna, Zachary [1]
Elhanati, Yuval [1]
Callan, Curtis G., Jr. [1,2]
Walczak, Aleksandra M. [2]
Mora, Thierry [2]
Affiliations:
[1] Princeton Univ, Joseph Henry Labs, Princeton, NJ 08544 USA
[2] Univ Paris Diderot, Sorbonne Univ, Lab Phys, CNRS,Ecole Normale Super,PSL Univ, F-75005 Paris, France
Funding:
European Research Council;
Keywords:
ANKYLOSING-SPONDYLITIS;
DIVERSITY;
TCR;
SPECIFICITY;
FREQUENCY;
DOI:
10.1093/bioinformatics/btz035
Chinese Library Classification (CLC) number:
Q5 [Biochemistry];
Discipline classification codes:
071010 ;
081704 ;
Abstract:
Motivation: High-throughput sequencing of large immune repertoires has enabled the development of methods to predict the probability of generation by V(D)J recombination of T- and B-cell receptors of any specific nucleotide sequence. These generation probabilities are very non-homogeneous, ranging over 20 orders of magnitude in real repertoires. Since the function of a receptor really depends on its protein sequence, it is important to be able to predict this probability of generation at the amino acid level. However, brute-force summation over all the nucleotide sequences with the correct amino acid translation is computationally intractable. The purpose of this paper is to present a solution to this problem.

Results: We use dynamic programming to construct an efficient and flexible algorithm, called OLGA (Optimized Likelihood estimate of immunoGlobulin Amino-acid sequences), for calculating the probability of generating a given CDR3 amino acid sequence or motif, with or without V/J restriction, as a result of V(D)J recombination in B or T cells. We apply it to databases of epitope-specific T-cell receptors to evaluate the probability that a typical human subject will possess T cells responsive to specific disease-associated epitopes. The model prediction shows an excellent agreement with published data. We suggest that OLGA may be a useful tool to guide vaccine design.
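Illustrative note: the contrast the abstract draws between brute-force summation and dynamic programming can be sketched on a toy problem. The sketch below is not OLGA's algorithm or its V(D)J recombination model. It assumes a made-up first-order Markov model over nucleotides (the `INIT` and `TRANS` values are invented for illustration) and uses hypothetical function names. It only shows why summing over all synonymous nucleotide sequences grows exponentially with length, while a pass that tracks the last nucleotide of each codon grows linearly.

```python
import math
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_FOR = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS_FOR.setdefault(aa, []).append(codon)

# Toy parameters (assumptions, not taken from the paper): uniform first base
# and an arbitrary transition matrix whose rows each sum to 1.
INIT = {b: 0.25 for b in BASES}
TRANS = {
    "A": {"A": 0.4, "C": 0.2, "G": 0.3, "T": 0.1},
    "C": {"A": 0.2, "C": 0.3, "G": 0.2, "T": 0.3},
    "G": {"A": 0.3, "C": 0.2, "G": 0.4, "T": 0.1},
    "T": {"A": 0.1, "C": 0.3, "G": 0.2, "T": 0.4},
}

def count_encodings(aa_seq):
    """Number of nucleotide sequences that translate to aa_seq."""
    return math.prod(len(CODONS_FOR[a]) for a in aa_seq)

def brute_force_prob(aa_seq, init, trans):
    """Sum the toy-model probability over every synonymous nucleotide sequence."""
    if not aa_seq:
        return 1.0
    total = 0.0
    for codons in product(*(CODONS_FOR[a] for a in aa_seq)):
        nt = "".join(codons)
        p = init[nt[0]]
        for prev, cur in zip(nt, nt[1:]):
            p *= trans[prev][cur]
        total += p
    return total

def dp_prob(aa_seq, init, trans):
    """Same quantity, computed codon by codon.

    state[b] holds the summed probability of all nucleotide prefixes that
    translate to the amino acids seen so far and end in base b.
    """
    if not aa_seq:
        return 1.0
    state = None
    for aa in aa_seq:
        new_state = {b: 0.0 for b in BASES}
        for c0, c1, c2 in CODONS_FOR[aa]:
            if state is None:
                entry = init[c0]  # first base of the whole sequence
            else:
                entry = sum(state[b] * trans[b][c0] for b in BASES)
            new_state[c2] += entry * trans[c0][c1] * trans[c1][c2]
        state = new_state
    return sum(state.values())

if __name__ == "__main__":
    seq = "CASSL"
    print("synonymous nucleotide sequences:", count_encodings(seq))
    print("brute force and DP agree:",
          math.isclose(brute_force_prob(seq, INIT, TRANS),
                       dp_prob(seq, INIT, TRANS), rel_tol=1e-9))
```

In this toy setting the brute-force loop visits every synonymous sequence, so its cost multiplies with each added residue, whereas the dynamic-programming pass does a fixed amount of work per residue.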
Pages: 2974-2981
Page count: 8