Combining Sequence and Epigenomic Data to Predict Transcription Factor Binding Sites Using Deep Learning

Cited by: 1
Authors
Jing, Fang [1]
Zhang, Shao-Wu [1]
Cao, Zhen [2]
Zhang, Shihua [2, 3]
Affiliations
[1] Northwestern Polytech Univ, Coll Automat, Minist Educ, Key Lab Informat Fusion Technol, Xian 710072, Shaanxi, Peoples R China
[2] Chinese Acad Sci, Acad Math & Syst Sci, NCMIS, CEMS,RCSDS, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Sch Math Sci, Beijing 100049, Peoples R China
Source
BIOINFORMATICS RESEARCH AND APPLICATIONS, ISBRA 2018 | 2018 / Volume 10847
Funding
National Natural Science Foundation of China;
Keywords
Bioinformatics; Machine learning; Transcription factors binding sites; Convolutional neural networks; DNA accessibility; Histone modification; CHROMATIN ACCESSIBILITY PREDICTION; NETWORKS;
DOI
10.1007/978-3-319-94968-0_23
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Knowing the transcription factor binding sites (TFBSs) is essential for modeling the underlying binding mechanisms and follow-up cellular functions. Convolutional neural networks (CNNs) have outperformed conventional methods in predicting TFBSs from the primary DNA sequence. In addition to DNA sequences, histone modifications and chromatin accessibility are also important factors influencing binding activity, and they have recently been explored for predicting TFBSs. However, current methods rarely integrate histone modifications and chromatin accessibility with sequence in a single CNN framework. To this end, we developed a general CNN model to integrate these data for predicting TFBSs. We systematically benchmarked a series of architecture variants by changing the network structure in terms of width and depth, and explored the effect of the length of the flanking regions included in each sample. We evaluated the performance of the three types of data and their combinations using 256 ChIP-seq experiments and compared the model with competing machine learning methods. We find that the contributions from these three types of data are complementary. Moreover, the integrative CNN framework significantly outperforms traditional machine learning methods.
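The abstract does not say how the three data types are presented to the network. As a minimal sketch of one common way to do it, assuming a per-base channel layout (four one-hot DNA rows followed by one row per epigenomic track), the input for a single genomic window could be assembled as below. The function names, track names, and signal values are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: build a multi-channel input for one genomic window.
# The channel layout (4 one-hot DNA rows + 1 row per epigenomic track) is an
# assumption for illustration, not the architecture described in the paper.

BASES = "ACGT"


def one_hot(seq):
    """Return 4 rows (A, C, G, T) of length len(seq); other symbols such as N give all zeros."""
    seq = seq.upper()
    return [[1.0 if base == b else 0.0 for base in seq] for b in BASES]


def build_input(seq, tracks):
    """Stack one-hot DNA rows with per-base epigenomic signal rows.

    tracks: dict mapping a track name (e.g. "DNase", "H3K4me3") to a list of
    per-base signal values with the same length as seq.
    """
    rows = one_hot(seq)
    for name in sorted(tracks):  # fixed order so the channel layout is reproducible
        signal = tracks[name]
        if len(signal) != len(seq):
            raise ValueError(
                f"track {name!r} has length {len(signal)}, expected {len(seq)}"
            )
        rows.append([float(v) for v in signal])
    return rows


# Made-up values for a 5 bp window, for illustration only.
example = build_input(
    "ACGTN",
    {"DNase": [0, 1, 3, 2, 0], "H3K4me3": [5, 5, 4, 0, 0]},
)
print(len(example), len(example[0]))  # number of channels, window length
```

Dropping a track from the dictionary gives a sequence-plus-one-modality input, which is one simple way to compare data-type combinations of the kind the abstract mentions.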
Pages: 241 - 252
Page count: 12
Related papers
50 in total
  • [31] TDTHub, a web server tool for the analysis of transcription factor binding sites in plants
    Grau, Joaquin
    Franco-Zorrilla, Jose M.
    PLANT JOURNAL, 2022, 111 (04) : 1203 - 1215
  • [32] A Computational Approach to Identify Transcription Factor Binding Sites Containing Spacer Regions
    Sundaramurthy, Punithavathi
    White, Brandon
    Lee, Wendy
    2021 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI 2021), 2021, : 366 - 369
  • [33] Evolution of DNA-Binding Sites of a Floral Master Regulatory Transcription Factor
    Muino, Jose M.
    de Bruijn, Suzanne
    Pajoro, Alice
    Geuten, Koen
    Vingron, Martin
    Angenent, Gerco C.
    Kaufmann, Kerstin
    MOLECULAR BIOLOGY AND EVOLUTION, 2016, 33 (01) : 185 - 200
  • [34] The Circular Economy and retail: using Deep Learning to predict business survival
    Uribe-Toril, Juan
    Ruiz-Real, Jose Luis
    Galindo Duran, Alejandro C.
    Torres Arriaza, Jose Antonio
    de Pablo Valenciano, Jaime
    ENVIRONMENTAL SCIENCES EUROPE, 2022, 34 (01)
  • [35] Predicting bacterial transcription factor binding sites through machine learning and structural characterization based on DNA duplex stability
    Borges Farias, Andre
    Martinez, Gustavo Sganzerla
    Galan-Vasquez, Edgardo
    Nicolas, Marisa Fabiana
    Perez-Rueda, Ernesto
    BRIEFINGS IN BIOINFORMATICS, 2024, 25 (06)
  • [36] Using deep learning to identify recent positive selection in malaria parasite sequence data
    Wouter Deelder
    Ernest Diez Benavente
    Jody Phelan
    Emilia Manko
    Susana Campino
    Luigi Palla
    Taane G. Clark
    Malaria Journal, 20
  • [37] Using deep learning to identify recent positive selection in malaria parasite sequence data
    Deelder, Wouter
    Benavente, Ernest Diez
    Phelan, Jody
    Manko, Emilia
    Campino, Susana
    Palla, Luigi
    Clark, Taane G.
    MALARIA JOURNAL, 2021, 20 (01)
  • [38] FCNGRU: Locating Transcription Factor Binding Sites by Combing Fully Convolutional Neural Network With Gated Recurrent Unit
    Wang, Siguo
    He, Ying
    Chen, Zhanheng
    Zhang, Qinhu
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022, 26 (04) : 1883 - 1890
  • [39] Local Epigenomic Data are more Informative than Local Genome Sequence Data in Predicting Enhancer-Promoter Interactions Using Neural Networks
    Xiao, Mengli
    Zhuang, Zhong
    Pan, Wei
    GENES, 2020, 11 (01)
  • [40] Automated Monitoring of Construction Sites of Electric Power Substations Using Deep Learning
    Soares Oliveira, Bruno Alberto
    De Faria Neto, Abilio Pereira
    Arruda Fernandino, Roberto Marcio
    Carvalho, Rogerio Fernandes
    Fernandes, Amanda Lopes
    Guimaraes, Frederico Gadelha
    IEEE ACCESS, 2021, 9 : 19195 - 19207