CLNN-loop: a deep learning model to predict CTCF-mediated chromatin loops in the different cell lines and CTCF-binding sites (CBS) pair types

Cited by: 38
Authors
Zhang, Pengyu [1, 2]
Wu, Yingfu [2 ]
Zhou, Haoru [2 ]
Zhou, Bing [2 ]
Zhang, Hongming [2 ]
Wu, Hao [1 ]
Affiliations
[1] Shandong Univ, Sch Software, Jinan 250101, Shandong, Peoples R China
[2] Northwest A&F Univ, Coll Informat Engn, Yangling 712100, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
REARRANGEMENTS; PRINCIPLES
DOI
10.1093/bioinformatics/btac575
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Three-dimensional (3D) genome organization is of vital importance in gene regulation and disease mechanisms. Previous studies have shown that CTCF-mediated chromatin loops are crucial to studying the 3D structure of cells. Although various experimental techniques have been developed to detect chromatin loops, they have been found to be time-consuming and costly. Nowadays, various sequence-based computational methods can capture significant features of 3D genome organization and help predict chromatin loops. However, these methods have low performance and poor generalization ability in predicting chromatin loops.
Results: Here, we propose a novel deep learning model, called CLNN-loop, to predict chromatin loops in different cell lines and CTCF-binding sites (CBS) pair types by fusing multiple sequence-based features. The analysis of a series of examinations based on the datasets in the previous study shows that CLNN-loop has satisfactory performance and is superior to the existing methods in terms of predicting chromatin loops. In addition, we apply the SHAP framework to interpret the predictions of different models, and find that CTCF motif and sequence conservation are important signs of chromatin loops in different cell lines and CBS pair types.
Pages: 4497-4504
Page count: 8