A bioinformatics pipeline to build a knowledge database for in silico antibody engineering

Cited by: 1
Authors
Zhao, Shanrong [1]
Lu, Jin [2]
Affiliations
[1] Johnson & Johnson Pharmaceut Res & Dev LLC, Silico Immunol, San Diego, CA 92121 USA
[2] Centocor R&D Inc, Biopharmaceut Res, Radnor, PA 19087 USA
Keywords
Germline; Hypermutation; Somatic mutation; Position specific scoring matrix; Knowledge database; Antibody engineering; Algorithm; Java; Bioinformatics; Web; CDR-IMGT; CLASS-SWITCH RECOMBINATION; T-CELL-RECEPTORS; SOMATIC HYPERMUTATION; IMMUNOGLOBULIN GENES; VARIABLE DOMAINS; INSERTIONS; DIVERSITY; DELETIONS; NOMENCLATURE
DOI
10.1016/j.molimm.2011.01.009
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
A challenge in antibody engineering is the large number of positions that can be varied and the range of possible variations at each, balanced against the risk of introducing unfavorable biochemical properties. While large libraries are quite successful in identifying antibodies with improved binding or activity, they explore only a fraction of the possibilities, and even that requires considerable effort. The vast array of natural antibody sequences provides a potential wealth of information on (1) selecting hotspots for variation, and (2) designing mutants that mimic the natural variations seen in hotspots. The human immune system can generate an enormous diversity of immunoglobulins against an almost unlimited range of antigens by gene rearrangement of a limited number of germline variable, diversity and joining genes, followed by somatic hypermutation and antigen selection. All the antibody sequences in the NCBI database can be assigned to different germline genes. As a result, a position specific scoring matrix for each germline gene can be constructed by aligning all its member sequences and calculating the amino acid frequencies at each position. The position specific scoring matrix for each germline gene characterizes "hotspots" and the nature of variations, and thus reduces the sequence space to be explored in antibody engineering. We have developed a bioinformatics pipeline to analyze human antibody sequences, and have generated a comprehensive knowledge database for in silico antibody engineering. The pipeline is fully automatic, and the knowledge database can be refreshed at any time by re-running the pipeline. The refresh process is fast, typically taking 1 min on a Lenovo ThinkPad T60 laptop with 3 GB of memory. Our knowledge database consists of (1) the individual germline gene usage in the generation of natural antibodies; (2) the CDR length distributions; and (3) the position specific scoring matrix for each germline gene.
The knowledge database provides comprehensive support for antibody engineering, including de novo library design through the selection of favorable germline V gene scaffolds and CDR lengths. In addition, we have developed a web application framework to present the knowledge database; its web interface helps users easily retrieve a variety of information from the database. © 2011 Elsevier Ltd. All rights reserved.
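The abstract describes building a position specific scoring matrix by aligning the member sequences of a germline gene and calculating amino acid frequencies at each position. The following is a minimal Python sketch of that idea only, not the authors' pipeline (the keywords suggest theirs was written in Java). The function names, the toy sequences, and the `max_germline_freq` threshold used to flag hotspots are all hypothetical, and the sketch assumes the sequences are already aligned to equal length.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def position_frequencies(aligned):
    """Return one {residue: frequency} dict per alignment column.

    Assumes every sequence has the same length; gap or unknown
    characters (anything outside the 20 standard residues) are ignored.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    matrix = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned if seq[col] in AMINO_ACIDS)
        total = sum(counts.values())
        matrix.append({aa: n / total for aa, n in counts.items()} if total else {})
    return matrix


def hotspots(matrix, germline, max_germline_freq=0.75):
    """Return 0-based positions where the germline residue is retained
    in less than max_germline_freq of the member sequences.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [
        i
        for i, (freqs, residue) in enumerate(zip(matrix, germline))
        if freqs.get(residue, 0.0) < max_germline_freq
    ]


# Toy data (hypothetical): a five-residue germline fragment and five members.
germline = "QVQLV"
members = ["QVQLV", "QVQLV", "QVKLV", "QVRLV", "EVQLV"]

matrix = position_frequencies(members)
print(matrix[2])                   # residue frequencies at the third position
print(hotspots(matrix, germline))  # positions flagged as variable
```

In this toy alignment only the third position falls below the threshold, and its frequency table (Q, K, R) shows which substitutions occur naturally there; this is the kind of information the abstract says narrows the sequence space for library design.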
Pages: 1019-1026
Number of pages: 8