CNN-Pred: Prediction of single-stranded and double-stranded DNA-binding protein using convolutional neural networks

Cited by: 11
Authors
Manavi, Farnoush [1]
Sharma, Alok [2, 3]
Sharma, Ronesh [4]
Tsunoda, Tatsuhiko [2, 5, 9]
Shatabda, Swakkhar [6]
Dehzangi, Iman [7, 8]
Affiliations
[1] Shiraz Univ, Comp Sci & Engn & Informat Technol Dept, Shiraz, Iran
[2] RIKEN Ctr Integrat Med Sci, Lab Med Sci Math, Yokohama 2300045, Japan
[3] Griffith Univ, Inst Integrated & Intelligent Syst, Brisbane, Qld 4111, Australia
[4] Fiji Natl Univ, Sch Elect & Elect Engn, Suva, Fiji
[5] Univ Tokyo, Sch Sci, Dept Biol Sci, Lab Med Sci Math, Tokyo 1130033, Japan
[6] United Int Univ, Dept Comp Sci & Engn, Dhaka, Bangladesh
[7] Rutgers State Univ, Dept Comp Sci, Camden, NJ 08102 USA
[8] Rutgers State Univ, Ctr Computat & Integrat Biol, Camden, NJ USA
[9] Univ Tokyo, Grad Sch Frontier Sci, Dept Computat Biol & Med Sci, Lab Med Sci Math, Tokyo 1130033, Japan
Keywords
Convolutional neural networks; DNA-binding proteins; SSBs; DSBs; Feature; SCORING MATRIX; SITES
DOI
10.1016/j.gene.2022.147045
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
DNA-binding proteins play a vital role in biological activities including DNA replication, DNA packing, and DNA repair. DNA-binding proteins can be classified into single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). Determining whether a protein is a DSB or an SSB helps determine the protein's function. Therefore, many studies have been conducted in recent years to accurately identify DSBs and SSBs. Despite the efforts made so far, DSB and SSB prediction performance remains limited. In this study, we propose a new method called CNN-Pred to accurately predict DSBs and SSBs. To build CNN-Pred, we first extract evolutionary-based features in the form of mono-gram and bi-gram profiles from the position-specific scoring matrix (PSSM). We then apply a 1D convolutional neural network (CNN) as the classifier to the extracted features. Our results demonstrate that CNN-Pred improves DSB and SSB prediction accuracy by more than 4% on the independent test set compared to previous studies found in the literature. CNN-Pred as a standalone tool and all its source code are publicly available at: https://github.com/MLBC-lab/CNN-Pred.
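The abstract names mono-gram and bi-gram PSSM profiles as the input features but does not give their formulas. The sketch below uses one common formulation from the PSSM literature and is an assumption, not necessarily the authors' exact implementation: the mono-gram is the column-wise mean of an L x 20 PSSM (20 values), and the bi-gram sums the products of scores at consecutive positions for every ordered pair of columns (400 values). The function names and the absence of any PSSM normalization are illustrative choices.

```python
from typing import List, Sequence

NUM_AA = 20  # one PSSM column per standard amino acid


def monogram(pssm: Sequence[Sequence[float]]) -> List[float]:
    """Column-wise mean over all sequence positions (20 values)."""
    length = len(pssm)
    return [sum(row[j] for row in pssm) / length for j in range(NUM_AA)]


def bigram(pssm: Sequence[Sequence[float]]) -> List[float]:
    """B[m][n] = sum_i P[i][m] * P[i+1][n], flattened row-major (400 values)."""
    feats = [0.0] * (NUM_AA * NUM_AA)
    # Pair each row with the row that follows it in the sequence.
    for cur, nxt in zip(pssm, pssm[1:]):
        for m in range(NUM_AA):
            for n in range(NUM_AA):
                feats[m * NUM_AA + n] += cur[m] * nxt[n]
    return feats


def pssm_features(pssm: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate mono-gram and bi-gram profiles into one 420-value vector."""
    if not pssm or any(len(row) != NUM_AA for row in pssm):
        raise ValueError("expected a non-empty L x 20 PSSM")
    return monogram(pssm) + bigram(pssm)


if __name__ == "__main__":
    # Toy 3 x 20 matrix standing in for a real PSSM (hypothetical values).
    toy = [[float(i * NUM_AA + j) for j in range(NUM_AA)] for i in range(3)]
    print(len(pssm_features(toy)))
```

Under these assumptions every protein, whatever its length, maps to a fixed-length vector of 420 values, which is the kind of input a 1D convolutional classifier can consume directly.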
Pages: 6