Transcription Factor Binding Site Prediction Using CnNet Approach

Cited by: 0
Authors
Masood, M. Mohamed Divan [1]
Manjula, D. [2]
Sugumaran, Vijayan [3, 4]
Affiliations
[1] BS Abdur Rahman Crescent Inst Sci & Technol, Dept Comp Sci & Engn, Chennai 600048, India
[2] Vellore Inst Technol, Dept Comp Sci & Engn, Chennai 600127, India
[3] Oakland Univ, Dept Decis & Informat Sci, Rochester, MI 48309 USA
[4] Oakland Univ, Ctr Data Sci & Big Data Analyt, Rochester, MI 48309 USA
Keywords
DNA; Hidden Markov models; Gene expression; Pulse width modulation; Proteins; Probes; Genetics; Motif discovery; transcription factor (TF) binding site; convolution neural network (CNN); multiple expression motifs for motif elicitation (MEME); sequence specificity; MOTIF DISCOVERY; GENE-EXPRESSION; DNA; SEQUENCE; ALGORITHM; STRATEGY; SEARCH;
DOI
10.1109/TCBB.2024.3411024
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Control of gene expression is the most important process in a living organism, and understanding it makes it easier to identify different kinds of diseases and their causes. It is very difficult to know which factors control gene expression. A transcription factor (TF) is a protein that plays an important role in gene expression. Discovering transcription factors has immense biological significance; however, it is challenging to develop novel techniques and evaluation methods for regulatory developments in biological structures. In this research, we focus mainly on the 'sequence specificities' that can be ascertained from experimental data with 'deep learning' techniques, which offer a scalable, flexible and unified computational approach for predicting transcription factor binding. Specifically, the Multiple Expression motifs for Motif Elicitation (MEME) technique is combined with a Convolution Neural Network (CNN), in an approach named CnNet, to discover the 'sequence specificities' in a dataset of DNA gene sequences. The process involves two steps: a) discovering motifs that are capable of identifying useful TF binding sites by using the MEME technique, and b) computing a score that indicates the likelihood of a given sequence being a useful binding site by using the CNN technique. The proposed CnNet approach predicts the TF binding score with much better accuracy than existing approaches.
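The two-step idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' CnNet implementation: the helper names (`one_hot`, `motif_to_filter`, `conv_scan`, `binding_score`), the +1/-1 filter weights, the "TATA" consensus and the bias value are all hypothetical choices made for illustration. A consensus string stands in for a motif found in step (a), and a single hand-built convolution filter with global max pooling and a sigmoid stands in for the CNN scoring of step (b). In a real CNN the filter weights would be learned from data.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows ordered A, C, G, T."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def motif_to_filter(consensus):
    """Toy stand-in for a discovered motif: +1 for the consensus base, -1 otherwise."""
    return [[1.0 if b == c else -1.0 for b in BASES] for c in consensus.upper()]

def conv_scan(encoded, motif_filter):
    """Slide the filter along the sequence, one score per window (a 1-D convolution)."""
    width = len(motif_filter)
    if len(encoded) < width:
        raise ValueError("sequence is shorter than the motif filter")
    return [
        sum(encoded[start + i][j] * motif_filter[i][j]
            for i in range(width) for j in range(4))
        for start in range(len(encoded) - width + 1)
    ]

def binding_score(seq, motif_filter, bias=0.0):
    """Global max pooling over window scores, then a sigmoid to get a value in (0, 1)."""
    pooled = max(conv_scan(one_hot(seq), motif_filter))
    return 1.0 / (1.0 + math.exp(-(pooled + bias)))

if __name__ == "__main__":
    tata = motif_to_filter("TATA")  # hypothetical motif, assumed for illustration
    for seq in ("GGTATAGG", "GGGGGGGG"):
        print(seq, round(binding_score(seq, tata, bias=-2.0), 3))
```

A sequence that contains the motif scores above 0.5 under these assumed weights, and a sequence without it scores below 0.5. The max pooling step makes the score independent of where in the sequence the motif occurs.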
Pages: 1721-1730
Page count: 10