ERVcaller: identifying polymorphic endogenous retrovirus and other transposable element insertions using whole-genome sequencing data

Cited by: 19
Authors
Chen, Xun [1]
Li, Dawei [1, 2, 3]
Affiliations
[1] Univ Vermont, Dept Microbiol & Mol Genet, Burlington, VT 05405 USA
[2] Univ Vermont, Neurosci Behav & Hlth Initiat, Burlington, VT 05405 USA
[3] Univ Vermont, Dept Comp Sci, Burlington, VT 05405 USA
Keywords
STRUCTURAL VARIATION; DISCOVERY; EVOLUTION; REVEALS; FORMAT
DOI
10.1093/bioinformatics/btz205
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Approximately 8% of the human genome is derived from endogenous retroviruses (ERVs). In recent years, an increasing number of human diseases have been found to be associated with ERVs. However, it remains challenging to accurately detect the full spectrum of polymorphic (unfixed) ERVs using whole-genome sequencing (WGS) data.

Results: We designed a new tool, ERVcaller, to detect and genotype transposable element (TE) insertions, including ERVs, in the human genome. We evaluated ERVcaller using both simulated and real benchmark WGS datasets. Compared to existing tools, ERVcaller consistently obtained both the highest sensitivity and precision for detecting simulated ERV and other TE insertions derived from real polymorphic TE sequences. For the WGS data from the 1000 Genomes Project, ERVcaller detected the largest number of TE insertions per sample based on consensus TE loci. By analyzing the experimentally verified TE insertions, ERVcaller had 94.0% TE detection sensitivity and 96.6% genotyping accuracy. Polymerase chain reaction and Sanger sequencing in a small sample set verified 86.7% of examined insertion statuses and 100% of examined genotypes. In conclusion, ERVcaller is capable of detecting and genotyping TE insertions using WGS data with both high sensitivity and precision. This tool can be applied broadly to other species.
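
The abstract reports detection sensitivity and precision without defining them. The sketch below shows one common way such metrics could be computed when comparing called insertions against a truth set (for example, simulated insertions). It is an illustration only, not ERVcaller's evaluation code: the `Insertion` type, the `evaluate_calls` function, and the 100 bp matching window are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insertion:
    """Hypothetical record for one TE insertion site."""
    chrom: str
    pos: int  # approximate breakpoint coordinate


def evaluate_calls(truth, calls, window=100):
    """Return (sensitivity, precision) for a list of calls.

    A call counts as a true positive if it lies on the same chromosome
    as a not-yet-matched truth insertion and within `window` bp of it.
    The window size is an assumed tolerance, not a published setting.
    """
    matched_truth = set()  # indices of truth sites already matched
    true_positive_calls = 0
    for call in calls:
        for i, site in enumerate(truth):
            if i in matched_truth:
                continue
            if site.chrom == call.chrom and abs(site.pos - call.pos) <= window:
                matched_truth.add(i)
                true_positive_calls += 1
                break
    # Sensitivity: fraction of true insertions that were detected.
    sensitivity = len(matched_truth) / len(truth) if truth else 0.0
    # Precision: fraction of calls that correspond to a true insertion.
    precision = true_positive_calls / len(calls) if calls else 0.0
    return sensitivity, precision


truth = [Insertion("chr1", 1000), Insertion("chr1", 5000), Insertion("chr2", 300)]
calls = [
    Insertion("chr1", 1040),  # matches chr1:1000
    Insertion("chr2", 310),   # matches chr2:300
    Insertion("chr3", 10),    # false positive
    Insertion("chr1", 9000),  # false positive
]
sensitivity, precision = evaluate_calls(truth, calls)
print(f"sensitivity={sensitivity:.3f} precision={precision:.3f}")
```

In this toy example two of three true insertions are recovered and two of four calls are correct. Matching is greedy and one-to-one, so a single truth site cannot be credited to more than one call.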
Pages: 3913-3922
Number of pages: 10