Maximum entropy models for antibody diversity

Cited by: 234
Authors
Mora, Thierry [1, 2]
Walczak, Aleksandra M. [1, 2, 3]
Bialek, William [1, 2, 3]
Callan, Curtis G., Jr. [1, 2, 3]
Affiliations
[1] Princeton Univ, Joseph Henry Labs Phys, Princeton, NJ 08544 USA
[2] Princeton Univ, Lewis Sigler Inst Integrat Genom, Princeton, NJ 08544 USA
[3] Princeton Univ, Princeton Ctr Theoret Sci, Princeton, NJ 08544 USA
Funding
US National Institutes of Health; US National Science Foundation
Keywords
D regions; immune receptor proteins; statistical models; INFORMATION-THEORY; PROTEIN;
DOI
10.1073/pnas.1001705107
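Chinese Library Classification (CLC) number
O [Mathematical and physical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract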
Recognition of pathogens relies on families of proteins showing great diversity. Here we construct maximum entropy models of the sequence repertoire, building on recent experiments that provide a nearly exhaustive sampling of the IgM sequences in zebrafish. These models are based solely on pairwise correlations between residue positions but correctly capture the higher order statistical properties of the repertoire. By exploiting the interpretation of these models as statistical physics problems, we make several predictions for the collective properties of the sequence ensemble: The distribution of sequences obeys Zipf's law, the repertoire decomposes into several clusters, and there is a massive restriction of diversity because of the correlations. These predictions are completely inconsistent with models in which amino acid substitutions are made independently at each site and are in good agreement with the data. Our results suggest that antibody diversity is not limited by the sequences encoded in the genome and may reflect rapid adaptation to antigenic challenges. This approach should be applicable to the study of the global properties of other protein families.
Pages: 5405-5410
Page count: 6