Maximum-Entropy Models of Sequenced Immune Repertoires Predict Antigen-Antibody Affinity

Cited by: 23
Authors
Asti, Lorenzo [1,2]
Uguzzoni, Guido [2,3,4]
Marcatili, Paolo [5]
Pagnani, Andrea [2,6]
Affiliations
[1] Univ Roma La Sapienza, Dipartimento Sci Base & Applicate Ingn, Piazzale Aldo Moro 5, I-00185 Rome, Italy
[2] Ctr Mol Biotechnol, Human Genet Fdn, Turin, Italy
[3] Univ Paris 04, UPMC, UMR 7238, Computat & Quantitat Biol, 15 Rue Ecole Med BC 1540, F-75006 Paris, France
[4] Univ Parma, Dipartimento Fis, I-43100 Parma, Italy
[5] Tech Univ Denmark, Dept Syst Biol, Ctr Biol Sequence Anal, DK-2800 Lyngby, Denmark
[6] Politecn Torino, Dept Appl Sci & Technol DISAT, Turin, Italy
Keywords
WEB SERVER; SIGNATURES
DOI
10.1371/journal.pcbi.1004870
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The immune system has developed a number of distinct complex mechanisms to shape and control the antibody repertoire. One of these mechanisms, the affinity maturation process, works in an evolutionary-like fashion: after binding to a foreign molecule, the antibody-producing B-cells exhibit a high-frequency mutation rate in the genome region that codes for the antibody active site. Eventually, cells that produce antibodies with higher affinity for their cognate antigen are selected and clonally expanded. Here, we propose a new statistical approach based on maximum entropy modeling in which a scoring function related to the binding affinity of antibodies against a specific antigen is inferred from a sample of sequences of the immune repertoire of an individual. We use our inference strategy to infer a statistical model on a data set obtained by sequencing a fairly large portion of the immune repertoire of an HIV-1 infected patient. The Pearson correlation coefficient between our scoring function and the IC50 neutralization titer measured on 30 different antibodies of known sequence is as high as 0.77 (p-value 10^-6), outperforming other sequence- and structure-based models.
Pages: 20