ERVcaller: identifying polymorphic endogenous retrovirus and other transposable element insertions using whole-genome sequencing data

Cited by: 19
Authors
Chen, Xun [1]
Li, Dawei [1, 2, 3]
Affiliations
[1] Univ Vermont, Dept Microbiol & Mol Genet, Burlington, VT 05405 USA
[2] Univ Vermont, Neurosci Behav & Hlth Initiat, Burlington, VT 05405 USA
[3] Univ Vermont, Dept Comp Sci, Burlington, VT 05405 USA
Keywords
STRUCTURAL VARIATION; DISCOVERY; EVOLUTION; REVEALS; FORMAT
DOI
10.1093/bioinformatics/btz205
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Approximately 8% of the human genome is derived from endogenous retroviruses (ERVs). In recent years, an increasing number of human diseases have been found to be associated with ERVs. However, it remains challenging to accurately detect the full spectrum of polymorphic (unfixed) ERVs using whole-genome sequencing (WGS) data.
Results: We designed a new tool, ERVcaller, to detect and genotype transposable element (TE) insertions, including ERVs, in the human genome. We evaluated ERVcaller using both simulated and real benchmark WGS datasets. Compared to existing tools, ERVcaller consistently obtained both the highest sensitivity and precision for detecting simulated ERV and other TE insertions derived from real polymorphic TE sequences. For the WGS data from the 1000 Genomes Project, ERVcaller detected the largest number of TE insertions per sample based on consensus TE loci. By analyzing the experimentally verified TE insertions, ERVcaller had 94.0% TE detection sensitivity and 96.6% genotyping accuracy. Polymerase chain reaction and Sanger sequencing in a small sample set verified 86.7% of examined insertion statuses and 100% of examined genotypes. In conclusion, ERVcaller is capable of detecting and genotyping TE insertions using WGS data with both high sensitivity and precision. This tool can be applied broadly to other species.
Pages: 3913-3922
Page count: 10
Related papers
50 in total
  • [21] Investigation of selection signatures of dairy goats using whole-genome sequencing data
    Peng, Weifeng
    Zhang, Yiyuan
    Gao, Lei
    Wang, Shuping
    Liu, Mengting
    Sun, Enrui
    Lu, Kaixin
    Zhang, Yunxia
    Li, Bing
    Li, Guoyin
    Cao, Jingya
    Yang, Mingsheng
    Guo, Yanfeng
    Wang, Mengyun
    Zhang, Yuming
    Wang, Zihan
    Han, Yan
    Fan, Shuhua
    Huang, Li
    BMC GENOMICS, 2025, 26 (01):
  • [22] Population analysis of the Korean native duck using whole-genome sequencing data
    Daehwan Lee
    Jongin Lee
    Kang-Neung Heo
    Kisang Kwon
    Youngbeen Moon
    Dajeong Lim
    Kyung-Tai Lee
    Jaebum Kim
    BMC Genomics, 21
  • [23] GENOME-WIDE ASSOCIATION STUDY OF EXTREME LONGEVITY USING WHOLE-GENOME SEQUENCING DATA
    Gurinovich, Anastasia
    Bae, Harold
    Song, Zeyuan
    Leshchyk, Anastasia
    Li, Mengze
    Andersen, Stacy
    Perls, Thomas
    Sebastiani, Paola
    INNOVATION IN AGING, 2022, 6 : 395 - 395
  • [24] Identification of transposable element-mediated deletions in 27 Korean individuals based on whole genome sequencing data
    Ha, Jungsu
    Lee, Wooseok
    Mun, Seyoung
    Kim, Yun-Ji
    Han, Kyudong
    GENES & GENOMICS, 2016, 38 (02) : 179 - 192
  • [25] Identification and characterisation of endogenous Avian Leukosis Virus subgroup E (ALVE) insertions in chicken whole genome sequencing data
    Mason, Andrew S.
    Lund, Ashlee R.
    Hocking, Paul M.
    Fulton, Janet E.
    Burt, David W.
    MOBILE DNA, 2020, 11 (01)
  • [26] Identification and characterisation of endogenous Avian Leukosis Virus subgroup E (ALVE) insertions in chicken whole genome sequencing data
    Andrew S. Mason
    Ashlee R. Lund
    Paul M. Hocking
    Janet E. Fulton
    David W. Burt
    Mobile DNA, 11
  • [27] Identification of transposable element-mediated deletions in 27 Korean individuals based on whole genome sequencing data
    Jungsu Ha
    Wooseok Lee
    Seyoung Mun
    Yun-Ji Kim
    Kyudong Han
    Genes & Genomics, 2016, 38 : 179 - 192
  • [28] Characterization of runs of homozygosity islands in American mink using whole-genome sequencing data
    Davoudi, Pourya
    Do, Duy Ngoc
    Rathgeber, Bruce
    Colombo, Stefanie
    Sargolzaei, Mehdi
    Plastow, Graham
    Wang, Zhiquan
    Miar, Younes
    JOURNAL OF ANIMAL BREEDING AND GENETICS, 2024, 141 (05) : 507 - 520
  • [29] Assessing the digenic model in rare disorders using population whole-genome sequencing data
    Moreno-Ruiz, Nerea
    Lao, Oscar
    Ignacio Arostegui, Juan
    Laayouni, Hafid
    Casals, Ferran
    EUROPEAN JOURNAL OF HUMAN GENETICS, 2023, 31 : 579 - 579
  • [30] Source Attribution of Human Campylobacteriosis Using Whole-Genome Sequencing Data and Network Analysis
    Wainaina, Lynda
    Merlotti, Alessandra
    Remondini, Daniel
    Henri, Clementine
    Hald, Tine
    Njage, Patrick Murigu Kamau
    PATHOGENS, 2022, 11 (06):