MaxHiC: A robust background correction model to identify biologically relevant chromatin interactions in Hi-C and capture Hi-C experiments

Cited by: 8
Authors
Alinejad-Rokny, Hamid [1, 2, 3, 4]
Modegh, Rassa Ghavami [5]
Rabiee, Hamid R. R. [5]
Sarbandi, Ehsan Ramezani
Rezaie, Narges [6]
Tam, Kin Tung [1, 2]
Forrest, Alistair R. R. [1, 2]
Affiliations
[1] Univ Western Australia, Harry Perkins Inst Med Res, QEII Med Ctr, Perth, Australia
[2] Univ Western Australia, Ctr Med Res, Perth, Australia
[3] UNSW Sydney, Grad Sch Biomed Engn, Bio Med Machine Learning Lab BML, Sydney, Australia
[4] Macquarie Univ, AI-enabled Proc AIP Res Ctr, Hlth Data Analyt Program, Sydney, Australia
[5] Sharif Univ Technol, Dept Comp Engn, Bioinformat & Computat Biol Lab, Tehran, Iran
[6] Univ Calif Irvine, Ctr Complex Biol Syst, Irvine, CA USA
Funding
Australian Research Council; UK Medical Research Council;
Keywords
EXPRESSION; REVEALS; ORGANIZATION; ANNOTATION; PRINCIPLES;
DOI
10.1371/journal.pcbi.1010241
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Author summary

MaxHiC is a robust machine-learning-based tool for identifying significant interacting regions from both Hi-C and capture Hi-C data. Existing models are designed for either Hi-C or capture Hi-C data; we developed MaxHiC to be applicable to both kinds of library, using a separate model for each. MaxHiC can also analyse very deep Hi-C libraries (e.g., Micro-C) without computational issues. MaxHiC significantly outperforms existing Hi-C significant interaction callers, and even Hi-C loop callers, in terms of enrichment of interactions between known regulatory regions as well as biologically relevant interactions.

Hi-C is a genome-wide chromosome conformation capture technology that detects interactions between pairs of genomic regions and exploits higher-order chromatin structures. Conceptually, Hi-C data count interaction frequencies between every position in the genome and every other position. Biologically functional interactions are expected to occur more frequently than transient background and artefactual interactions. To identify biologically relevant interactions, several background models have been proposed that take biases such as distance, GC content and mappability into account. Here we introduce MaxHiC, a background correction tool that deals with these complex biases and robustly identifies statistically significant interactions in both Hi-C and capture Hi-C experiments. MaxHiC uses a negative binomial distribution model and a maximum likelihood technique to correct biases in both Hi-C and capture Hi-C libraries. We systematically benchmark MaxHiC against major Hi-C background correction tools, including Hi-C significant interaction callers (SIC) and Hi-C loop callers, using published Hi-C, capture Hi-C, and Micro-C datasets.
Our results demonstrate that, compared with existing methods, 1) interacting regions identified by MaxHiC have significantly greater overlap with known regulatory features (e.g. active chromatin histone marks, CTCF binding sites, DNase sensitivity) and with disease-associated genome-wide association SNPs, 2) pairs of interacting regions are more likely to be linked by eQTL pairs, and 3) interactions are more likely to link known regulatory features, including known functional enhancer-promoter pairs validated by CRISPRi. We also demonstrate that interactions between different genomic region types have distinct distance distributions that are revealed only by MaxHiC. MaxHiC is publicly available as a Python package for the analysis of Hi-C, capture Hi-C and Micro-C data.
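
The abstract describes a negative binomial background model fitted by maximum likelihood. The following is a minimal sketch of that general idea only, not MaxHiC's implementation or API. It assumes the expected count depends on genomic distance alone (ignoring GC content, mappability and the other biases the paper models), fits a single dispersion parameter with a golden-section search (which assumes the likelihood has a single peak in the searched range), and uses invented toy counts. All function names here are hypothetical.

```python
import math
from collections import defaultdict

def nb_logpmf(k, mu, r):
    """Log-probability of count k under a negative binomial with mean mu, dispersion r."""
    mu = max(mu, 1e-9)  # guard against log(0) when a distance has no reads
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu)))

def expected_by_distance(contacts):
    """Mean read count per bin distance: a crude stand-in for a full bias model."""
    totals, n = defaultdict(float), defaultdict(int)
    for i, j, count in contacts:
        d = abs(i - j)
        totals[d] += count
        n[d] += 1
    return {d: totals[d] / n[d] for d in totals}

def fit_dispersion(contacts, expected, lo=1e-3, hi=1e3, iters=60):
    """Maximum-likelihood dispersion r, found by golden-section search on log(r)."""
    def loglik(log_r):
        r = math.exp(log_r)
        return sum(nb_logpmf(c, expected[abs(i - j)], r) for i, j, c in contacts)
    a, b = math.log(lo), math.log(hi)
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        x1, x2 = b - phi * (b - a), a + phi * (b - a)
        if loglik(x1) < loglik(x2):
            a = x1  # the maximum lies to the right of x1
        else:
            b = x2  # the maximum lies to the left of x2
    return math.exp((a + b) / 2)

def p_value(count, mu, r):
    """Upper-tail probability P(X >= count) under the fitted background."""
    below = sum(math.exp(nb_logpmf(k, mu, r)) for k in range(count))
    return max(0.0, 1.0 - below)

# Toy data (invented): (bin_i, bin_j, read_count); the last pair is an outlier.
contacts = [(0, 1, 10), (1, 2, 12), (2, 3, 9), (3, 4, 11),
            (0, 2, 5), (1, 3, 4), (2, 4, 30)]
expected = expected_by_distance(contacts)
r = fit_dispersion(contacts, expected)
for i, j, c in contacts:
    print(i, j, c, p_value(c, expected[abs(i - j)], r))
```

Pairs whose observed count lies far in the upper tail of the background distribution receive small p-values. A real analysis would also correct for multiple testing and would estimate the expectation from many more bin pairs than this toy example contains.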
Pages: 26