ERVcaller: identifying polymorphic endogenous retrovirus and other transposable element insertions using whole-genome sequencing data

Cited by: 19
Authors
Chen, Xun [1]
Li, Dawei [1, 2, 3]
Affiliations
[1] Univ Vermont, Dept Microbiol & Mol Genet, Burlington, VT 05405 USA
[2] Univ Vermont, Neurosci Behav & Hlth Initiat, Burlington, VT 05405 USA
[3] Univ Vermont, Dept Comp Sci, Burlington, VT 05405 USA
Keywords
STRUCTURAL VARIATION; DISCOVERY; EVOLUTION; REVEALS; FORMAT
DOI
10.1093/bioinformatics/btz205
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Approximately 8% of the human genome is derived from endogenous retroviruses (ERVs). In recent years, an increasing number of human diseases have been found to be associated with ERVs. However, it remains challenging to accurately detect the full spectrum of polymorphic (unfixed) ERVs using whole-genome sequencing (WGS) data.
Results: We designed a new tool, ERVcaller, to detect and genotype transposable element (TE) insertions, including ERVs, in the human genome. We evaluated ERVcaller using both simulated and real benchmark WGS datasets. Compared to existing tools, ERVcaller consistently obtained both the highest sensitivity and precision for detecting simulated ERV and other TE insertions derived from real polymorphic TE sequences. For the WGS data from the 1000 Genomes Project, ERVcaller detected the largest number of TE insertions per sample based on consensus TE loci. By analyzing the experimentally verified TE insertions, ERVcaller had 94.0% TE detection sensitivity and 96.6% genotyping accuracy. Polymerase chain reaction and Sanger sequencing in a small sample set verified 86.7% of examined insertion statuses and 100% of examined genotypes. In conclusion, ERVcaller is capable of detecting and genotyping TE insertions using WGS data with both high sensitivity and precision. This tool can be applied broadly to other species.
Pages: 3913-3922
Page count: 10