Analysis and prediction of DNA-binding proteins and their binding residues based on composition, sequence and structural information

Cited by: 291

Authors:

Ahmad, S [1]

Gromiha, MM

Sarai, A

Affiliations:

[1] Kyushu Inst Technol, Dept Biochem Sci & Engn, Iizuka, Fukuoka 8208502, Japan

[2] Jamia Millia Islamia, Dept Biosci, New Delhi 110025, India

[3] AIST, Computat Biol Res Ctr, CBRC, Koto Ku, Tokyo 1350064, Japan

Source:

BIOINFORMATICS | 2004 / Vol. 20 / Issue 04

Keywords:

DOI:

10.1093/bioinformatics/btg432

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Discipline classification code:

071010 ; 081704 ;

Abstract:

Motivation: Though vitally important to cell function, the mechanism of protein-DNA binding is not yet completely understood. We therefore analysed the relationship between DNA binding and protein sequence composition, solvent accessibility and secondary structure. Using non-redundant databases of transcription factors and protein-DNA complexes, neural network models were developed to utilize the information present in this relationship to predict DNA-binding proteins and their binding residues.

Results: Sequence composition was found to provide sufficient information to predict the probability of a protein binding to DNA with nearly 69% sensitivity at 64% accuracy for the considered proteins; sequence neighbourhood and solvent accessibility information were sufficient to make binding site predictions with 40% sensitivity at 79% accuracy. Detailed analysis of binding residues shows that some three- and five-residue segments frequently bind to DNA and that solvent accessibility plays a major role in binding. Although binding behaviour was not associated with any particular secondary structure, there were interesting exceptions at the residue level. Over-representation of some residues in the binding sites was largely lost at the total sequence level, but a different kind of compositional preference was observed in DNA-binding proteins.
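
To make the terms in the abstract concrete, the following is a minimal sketch of how a sequence-composition feature vector and the two reported metrics might be computed. It is an illustration only, not the authors' implementation: the function names `composition` and `sensitivity_and_accuracy` are hypothetical, and the metric definitions used here (sensitivity = TP / (TP + FN), accuracy = fraction of correct predictions) are assumptions, since the abstract does not define them.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (one-letter codes)


def composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    Hypothetical helper: yields a 20-element vector of the kind a
    composition-based classifier could take as input.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def sensitivity_and_accuracy(y_true, y_pred):
    """Assumed definitions: sensitivity = TP / (TP + FN),
    accuracy = (TP + TN) / number of samples."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    acc = (tp + tn) / len(pairs) if pairs else 0.0
    return sens, acc
```

Under these assumed definitions, a statement such as "69% sensitivity at 64% accuracy" would mean that 69% of true DNA-binding cases are recovered while 64% of all predictions are correct; the paper itself should be consulted for the definitions actually used.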

Citation

Pages: 477-486

Page count: 10

20 entries in total

[1] Real value prediction of solvent accessibility from amino acid sequence [J].

Ahmad, S ;

Gromiha, MM ;

Sarai, A .

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2003, 50 (04) :629-635

[2] NETASA: neural network based prediction of solvent accessibility [J].

Ahmad, S ;

Gromiha, MM .

BIOINFORMATICS, 2002, 18 (06) :819-824

[3] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].

Altschul, SF ;

Madden, TL ;

Schaffer, AA ;

Zhang, JH ;

Zhang, Z ;

Miller, W ;

Lipman, DJ .

NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402

[4] The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003 [J].

Boeckmann, B ;

Bairoch, A ;

Apweiler, R ;

Blatter, MC ;

Estreicher, A ;

Gasteiger, E ;

Martin, MJ ;

Michoud, K ;

O'Donovan, C ;

Phan, I ;

Pilbout, S ;

Schneider, M .

NUCLEIC ACIDS RESEARCH, 2003, 31 (01) :365-370

[5] Cuff JA, 2000, PROTEINS, V40, P502, DOI 10.1002/1097-0134(20000815)40:3<502::AID-PROT170>3.0.CO;2-Q

[7] Comparison between long-range interactions and contact order in determining the folding rate of two-state proteins: Application of long-range order to folding rate prediction [J].

Gromiha, MM ;

Selvaraj, S .

JOURNAL OF MOLECULAR BIOLOGY, 2001, 310 (01) :27-32

[8] Role of structural and sequence information in the prediction of protein stability changes: comparison between buried and partially buried mutations [J].

Gromiha, MM ;

Oobatake, M ;

Kono, H ;

Uedaira, H ;

Sarai, A .

PROTEIN ENGINEERING, 1999, 12 (07) :549-555

[9] Removing near-neighbour redundancy from large protein sequence collections [J].

Holm, L ;

Sander, C .

BIOINFORMATICS, 1998, 14 (05) :423-429

[10] DICTIONARY OF PROTEIN SECONDARY STRUCTURE - PATTERN-RECOGNITION OF HYDROGEN-BONDED AND GEOMETRICAL FEATURES [J].

KABSCH, W ;

SANDER, C .

BIOPOLYMERS, 1983, 22 (12) :2577-2637
