OLGA: fast computation of generation probabilities of B- and T-cell receptor amino acid sequences and motifs

Cited by: 122
Authors
Sethna, Zachary [1]
Elhanati, Yuval [1]
Callan, Curtis G., Jr. [1, 2]
Walczak, Aleksandra M. [2]
Mora, Thierry [2]
Affiliations
[1] Princeton Univ, Joseph Henry Labs, Princeton, NJ 08544 USA
[2] Univ Paris Diderot, Sorbonne Univ, Lab Phys, CNRS, Ecole Normale Super, PSL Univ, F-75005 Paris, France
Funding
European Research Council;
Keywords
ANKYLOSING-SPONDYLITIS; DIVERSITY; TCR; SPECIFICITY; FREQUENCY;
DOI
10.1093/bioinformatics/btz035
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Motivation: High-throughput sequencing of large immune repertoires has enabled the development of methods to predict the probability of generation by V(D)J recombination of T- and B-cell receptors of any specific nucleotide sequence. These generation probabilities are very non-homogeneous, ranging over 20 orders of magnitude in real repertoires. Since the function of a receptor really depends on its protein sequence, it is important to be able to predict this probability of generation at the amino acid level. However, brute-force summation over all the nucleotide sequences with the correct amino acid translation is computationally intractable. The purpose of this paper is to present a solution to this problem. Results: We use dynamic programming to construct an efficient and flexible algorithm, called OLGA (Optimized Likelihood estimate of immunoGlobulin Amino-acid sequences), for calculating the probability of generating a given CDR3 amino acid sequence or motif, with or without V/J restriction, as a result of V(D)J recombination in B or T cells. We apply it to databases of epitope-specific T-cell receptors to evaluate the probability that a typical human subject will possess T cells responsive to specific disease-associated epitopes. The model prediction shows an excellent agreement with published data. We suggest that OLGA may be a useful tool to guide vaccine design.
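The contrast the abstract draws between brute-force summation and dynamic programming can be illustrated with a toy sketch. The code below is not OLGA's algorithm and does not use its API: it assumes a hypothetical first-order Markov model over nucleotides (the `INIT` and `TRANS` values are invented for illustration) and ignores V, D and J gene templates, deletions and insertions entirely. It only shows how the sum over all nucleotide sequences with a given amino acid translation can be accumulated codon by codon, so that the cost grows linearly with sequence length instead of with the product of the codon degeneracies.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: aa
               for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)}
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)

# Hypothetical toy model (an assumption, not inferred from data):
# a first-order Markov chain over nucleotides.
INIT = {b: 0.25 for b in BASES}
TRANS = {x: {y: (0.4 if x == y else 0.2) for y in BASES} for x in BASES}


def seq_prob(nt):
    """Probability of one non-empty nucleotide sequence under the toy model."""
    p = INIT[nt[0]]
    for prev, cur in zip(nt, nt[1:]):
        p *= TRANS[prev][cur]
    return p


def pgen_brute_force(aa_seq):
    """Sum seq_prob over every nucleotide sequence translating to aa_seq."""
    total = 0.0
    for codons in product(*(CODONS_FOR[aa] for aa in aa_seq)):
        total += seq_prob("".join(codons))
    return total


def pgen_dp(aa_seq):
    """Same sum for a non-empty aa_seq, accumulated codon by codon.

    state[b] holds the total probability of all nucleotide prefixes that
    translate to the amino acids seen so far and end in nucleotide b.
    """
    state = None
    for aa in aa_seq:
        new_state = dict.fromkeys(BASES, 0.0)
        for codon in CODONS_FOR[aa]:
            if state is None:
                p_in = INIT[codon[0]]
            else:
                p_in = sum(state[b] * TRANS[b][codon[0]] for b in BASES)
            p = p_in * TRANS[codon[0]][codon[1]] * TRANS[codon[1]][codon[2]]
            new_state[codon[2]] += p
        state = new_state
    return sum(state.values())
```

For a short sequence such as `"CASS"` both functions can be run and compared directly; for realistic CDR3 lengths only the dynamic programming version remains practical, because the number of terms in the brute-force sum multiplies with every added residue.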
Pages: 2974-2981
Page count: 8