Characterization of intermediate-sized insertions using whole-genome sequencing data and analysis of their functional impact on gene expression

Authors
Saeideh Ashouri
Jing Hao Wong
Hidewaki Nakagawa
Mihoko Shimada
Katsushi Tokunaga
Akihiro Fujimoto
Affiliations
[1] Department of Human Genetics, Graduate School of Medicine, The University of Tokyo
[2] Genome Medical Science Project
[3] National Center for Global Health and Medicine
[4] Laboratory for Cancer Genomics
[5] RIKEN Center for Integrative Medical Sciences
Source
Human Genetics | 2021 / Volume 140
Abstract
Intermediate-sized insertions are one class of structural variant contributing to genome diversity. However, because they are technically difficult to identify, their importance in disease pathogenicity and gene expression regulation remains unclear. We used whole-genome sequencing data from 174 Japanese samples to characterize intermediate-sized insertions with a highly accurate insertion calling method (IMSindel software and joint-call recovery) and obtained a catalogue of 4254 insertions. We constructed an imputation panel comprising insertions and SNVs from all samples, and imputed intermediate-sized insertions for 82 publicly available Japanese samples. The positive predictive value of imputation, evaluated using Nanopore long-read sequencing data, was 97%. Subsequent eQTL analysis predicted 128 (~3.0%) insertions as causative for changes in gene expression level. Enrichment analysis of causal insertions for genome regulatory elements showed significant associations with CTCF-binding sites, super-enhancers, and promoters. Among 17 causal insertions found in the same causal set as GWAS hits, there were insertions associated with changes in the expression of cancer-related genes such as BRCA1, ZNF222, and ABCB10. Analysis of insertion sequences revealed that 461 insertions were short tandem duplications, which were frequently found in early-replicating regions of the genome. Furthermore, a comparison of the functional importance of intermediate-sized insertions with that of intermediate-sized deletions detected in the same sample set in our previous study showed that insertions were more frequent in genic regions and that the proportion of functional candidates was smaller among insertions. Here, we characterize a high-confidence set of intermediate-sized insertions and indicate their importance in gene expression regulation. Our results emphasize the importance of considering intermediate-sized insertions in trait association studies.
Pages: 1201-1216
Number of pages: 15