MsDBP: Exploring DNA-Binding Proteins by Integrating Multiscale Sequence Information via Chou's Five-Step Rule

Cited by: 55
Authors
Du, Xiuquan [1]
Diao, Yanyu [1]
Liu, Heng [2]
Li, Shuo [3]
Affiliations
[1] Anhui Univ, Sch Comp Sci & Technol, Hefei, Anhui, Peoples R China
[2] Anhui Med Univ, Affiliated Hosp 1, Dept Gastroenterol, Hefei, Anhui, Peoples R China
[3] Western Univ, Dept Med Imaging, London, ON N6A 3K7, Canada
Keywords
DNA-binding proteins; multiscale features; dense layers; AMINO-ACID-COMPOSITION; PREDICT SUBCELLULAR-LOCALIZATION; LYSINE SUCCINYLATION SITES; CRITICAL SPHERICAL-SHELL; FLEXIBLE WEB SERVER; K-TUPLE; ENSEMBLE CLASSIFIER; RECOMBINATION SPOTS; PSEUDO COMPONENTS; TUMOR-SUPPRESSOR;
DOI
10.1021/acs.jproteome.9b00226
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
DNA-binding proteins are crucial to alternative splicing, methylation, and the structural composition of DNA. Existing experimental methods for identifying DNA-binding proteins are expensive and time-consuming, so a fast and accurate computational method is needed. In this Article, we report MsDBP, a novel DNA-binding protein predictor that feeds multiscale sequence features into a deep neural network. First, instead of developing a structure-based method with narrow applicability, we commit to a sequence-based predictor. Second, instead of characterizing the whole protein directly, we divide the protein into subsequences of different lengths and encode each of them into a vector based on composition information, which yields the multiscale sequence features. Finally, a branch of dense layers is applied to learn multilevel abstractions that discriminate DNA-binding proteins. When MsDBP is tested on the independent data set PDB2272, it achieves an overall accuracy of 66.99% with a sensitivity (SE) of 70.69%. We also perform extensive experiments to compare the proposed method with other existing methods. The results indicate that MsDBP would be a useful tool for the identification of DNA-binding proteins. MsDBP is freely available as a web server at http://47.100.203.218/MsDBP/.
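The multiscale encoding described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how subsequences are chosen or which composition descriptor is used, so everything below is an assumption made for illustration: the protein is cut into 1, 2, and 4 roughly equal segments, and each segment is encoded as a 20-dimensional amino acid composition vector. The function names (`composition`, `split_into_segments`, `multiscale_features`) and the `scales` parameter are hypothetical and are not taken from MsDBP.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines the vector layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(segment):
    """Return the fraction of each standard amino acid in `segment`.

    Non-standard residues count toward the segment length but have no
    slot of their own in the vector (an assumption of this sketch).
    """
    if not segment:
        return [0.0] * len(AMINO_ACIDS)
    counts = Counter(segment)
    return [counts.get(aa, 0) / len(segment) for aa in AMINO_ACIDS]


def split_into_segments(sequence, n_segments):
    """Cut `sequence` into `n_segments` contiguous, roughly equal pieces."""
    length = len(sequence)
    bounds = [round(i * length / n_segments) for i in range(n_segments + 1)]
    return [sequence[bounds[i]:bounds[i + 1]] for i in range(n_segments)]


def multiscale_features(sequence, scales=(1, 2, 4)):
    """Concatenate composition vectors computed at several scales.

    `scales` is a hypothetical choice: each entry is the number of
    segments the sequence is divided into at that scale.
    """
    features = []
    for n_segments in scales:
        for segment in split_into_segments(sequence, n_segments):
            features.extend(composition(segment))
    return features
```

With the assumed scales (1, 2, 4), every protein maps to a fixed-length vector of 20 x (1 + 2 + 4) = 140 values regardless of its length, which is what allows such features to be passed to dense layers.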
Pages: 3119-3132
Page count: 14