Transcription Factor Binding Site Prediction Using CnNet Approach

Cited by: 0
Authors
Masood, M. Mohamed Divan [1]
Manjula, D. [2]
Sugumaran, Vijayan [3, 4]
Affiliations
[1] BS Abdur Rahman Crescent Inst Sci & Technol, Dept Comp Sci & Engn, Chennai 600048, India
[2] Vellore Inst Technol, Dept Comp Sci & Engn, Chennai 600127, India
[3] Oakland Univ, Dept Decis & Informat Sci, Rochester, MI 48309 USA
[4] Oakland Univ, Ctr Data Sci & Big Data Analyt, Rochester, MI 48309 USA
Keywords
DNA; Hidden Markov models; Gene expression; Pulse width modulation; Proteins; Probes; Genetics; Motif discovery; transcription factor (TF) binding site; convolution neural network (CNN); multiple expression motifs for motif elicitation (MEME); sequence specificity; MOTIF DISCOVERY; GENE-EXPRESSION; DNA; SEQUENCE; ALGORITHM; STRATEGY; SEARCH;
DOI
10.1109/TCBB.2024.3411024
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Controlling gene expression is among the most important processes in a living organism, and understanding it makes it easier to identify diseases and their causes. However, it is very difficult to know which factors control gene expression. A transcription factor (TF) is a protein that plays an important role in gene expression. Discovering transcription factors has immense biological significance; however, it is challenging to develop novel techniques and evaluations for regulatory processes in biological systems. In this research, we focus mainly on sequence specificities that can be ascertained from experimental data with deep learning techniques, which offer a scalable, flexible, and unified computational approach for predicting transcription factor binding. Specifically, the Multiple Expression motifs for Motif Elicitation (MEME) technique combined with a Convolution Neural Network (CNN), named CnNet, is used to discover the sequence specificities of a DNA gene sequence dataset. The process involves two steps: (a) discovering motifs capable of identifying useful TF binding sites using the MEME technique, and (b) computing a score indicating the likelihood that a given sequence is a useful binding site using the CNN technique. The proposed CnNet approach predicts the TF binding score with much better accuracy than existing approaches.
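The two-step idea described in the abstract (a motif first, then a convolution-style score) can be illustrated with a minimal sketch. This is not the authors' CnNet implementation: the motif matrix `HYPOTHETICAL_PWM`, the output `weight` and `bias`, and all function names are assumptions made for illustration. In practice the motif would come from a motif-discovery tool such as MEME and the weights would be learned from binding data.

```python
import math

ALPHABET = "ACGT"

# Hypothetical position probability matrix for the consensus "TGAC"
# (rows = motif positions, columns = A, C, G, T). Illustrative only.
HYPOTHETICAL_PWM = [
    [0.05, 0.05, 0.05, 0.85],  # T
    [0.05, 0.05, 0.85, 0.05],  # G
    [0.85, 0.05, 0.05, 0.05],  # A
    [0.05, 0.85, 0.05, 0.05],  # C
]


def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T); unknown bases become all zeros."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET]
            for base in seq.upper()]


def pwm_to_filter(pwm, background=0.25):
    """Convert motif probabilities into a log-odds filter against a uniform background."""
    return [[math.log2(p / background) for p in row] for row in pwm]


def conv_scan(encoded, filt):
    """Slide the filter along the sequence and return one activation per window."""
    width = len(filt)
    activations = []
    for start in range(len(encoded) - width + 1):
        total = 0.0
        for offset in range(width):
            for channel in range(len(ALPHABET)):
                total += encoded[start + offset][channel] * filt[offset][channel]
        activations.append(total)
    return activations


def binding_score(seq, pwm, weight=1.0, bias=-4.0):
    """Score a sequence: convolution, ReLU, global max-pooling, then a sigmoid output.

    weight and bias are assumed, untrained values chosen for the example.
    """
    if len(seq) < len(pwm):
        raise ValueError("sequence is shorter than the motif")
    activations = conv_scan(one_hot(seq), pwm_to_filter(pwm))
    rectified = [max(0.0, a) for a in activations]   # ReLU
    pooled = max(rectified)                          # global max-pooling
    return 1.0 / (1.0 + math.exp(-(weight * pooled + bias)))


if __name__ == "__main__":
    for candidate in ("AATGACGG", "AAAAAAAA"):
        print(candidate, round(binding_score(candidate, HYPOTHETICAL_PWM), 3))
```

With these assumed values, a sequence containing the consensus scores well above one that does not. A trained network would learn many such filters together with the output weights rather than fixing them by hand.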
Pages: 1721-1730
Page count: 10