Prediction of the 3D cancer genome from whole-genome sequencing using InfoHiC
Cited by: 0
Authors:
Lee, Yeonghun [1]
Park, Sung-Hye [2,3]
Lee, Hyunju [1,4]
Affiliations:
[1] Gwangju Inst Sci & Technol, Sch Elect Engn & Comp Sci, 123 Cheomdangwagi Ro, Gwangju 61005, South Korea
[2] Seoul Natl Univ, Seoul Natl Univ Hosp, Coll Med, Dept Pathol, 103 Daehak Ro, Seoul 03080, South Korea
[3] Seoul Natl Univ, Coll Med, Neurosci Res Inst, 103 Daehak Ro, Seoul 03080, South Korea
[4] Gwangju Inst Sci & Technol, AI Grad Sch, 123 Cheomdangwagi Ro, Gwangju 61005, South Korea
Keywords:
Hi-C Prediction;
3D Genome;
Structural Variation;
Cancer Genome;
Deep Learning;
STRUCTURAL VARIATION;
HI-C;
VARIANTS;
QUANTIFICATION;
ASSOCIATION;
LANDSCAPE;
DISCOVERY;
DOI:
10.1038/s44320-024-00065-2
Chinese Library Classification (CLC) number:
Q5 [Biochemistry];
Q7 [Molecular Biology];
Discipline classification code:
071010 ;
081704 ;
Abstract:
The 3D genome prediction in cancer is crucial for uncovering the impact of structural variations (SVs) on tumorigenesis, especially when they are present in noncoding regions. We present InfoHiC, a systemic framework for predicting the 3D cancer genome directly from whole-genome sequencing (WGS). InfoHiC utilizes contig-specific copy number encoding on the SV contig assembly, and performs a contig-to-total Hi-C conversion for the cancer Hi-C prediction from multiple SV contigs. We showed that InfoHiC can predict 3D genome folding from all types of SVs using breast cancer cell line data. We applied it to WGS data of patients with breast cancer and pediatric patients with medulloblastoma, and identified neo topologically associating domains. For breast cancer, we discovered super-enhancer hijacking events associated with oncogenic overexpression and poor survival outcomes. For medulloblastoma, we found SVs in noncoding regions that caused super-enhancer hijacking events of medulloblastoma driver genes (GFI1, GFI1B, and PRDM6). In addition, we provide trained models for cancer Hi-C prediction from WGS at https://github.com/dmcb-gist/InfoHiC, uncovering the impacts of SVs in cancer patients and revealing novel therapeutic targets. 
InfoHiC is developed for cancer 3D genome prediction, enabling the discovery of neo-TADs and neo-loops starting from whole-genome sequencing reads. InfoHiC is trained on available cancer Hi-C data and predicts Hi-C matrices for patients with cancer, which eliminates the burden of high Hi-C sequencing costs. InfoHiC takes whole-genome sequencing reads as input, in contrast to other Hi-C prediction tools that require pre-defined sequences, enabling Hi-C prediction from cancer genomes where users cannot easily define sequence inputs. It analyzes all structural variation types, including those found in noncoding regions, and predicts their impact on cancer development in the 3D genome context. It can lead to personalized medication by discovering neo-TADs and neo-loops in patients in whom common coding driver mutations or copy number changes are not found.
Pages: 1156-1172
Number of pages: 17