RelocaTE2: a high resolution transposable element insertion site mapping tool for population resequencing

Cited by: 26
Authors
Chen, Jinfeng [1, 2, 3]
Wrightsman, Travis R. [3]
Wessler, Susan R. [2, 3]
Stajich, Jason E. [1, 2]
Affiliations
[1] Univ Calif Riverside, Dept Plant Pathol & Microbiol, Riverside, CA 92521 USA
[2] Univ Calif Riverside, Inst Integrat Genome Biol, Riverside, CA 92521 USA
[3] Univ Calif Riverside, Dept Bot & Plant Sci, Riverside, CA 92521 USA
Source
PeerJ | 2017 / Vol. 5
Funding
National Science Foundation (USA)
Keywords
Annotation; Diversity; Parallel processing; Transposons; Population genomics; Short read; Bioinformatics; Rice; Resequencing; GENERATION SEQUENCING DATA; GENE REGULATORY NETWORKS; HUMAN GENOME; STRUCTURAL VARIATION; EVOLUTION; RICE; IDENTIFICATION; ALIGNMENT; READS;
DOI
10.7717/peerj.2942
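Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract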
Background. Transposable element (TE) polymorphisms are important components of population genetic variation. The functional impacts of TEs in gene regulation and in generating genetic diversity have been observed in multiple species, but the frequency and magnitude of TE variation are underappreciated. Inexpensive and deep sequencing technology has made it affordable to apply population genetic methods to whole genomes with methods that identify single nucleotide and insertion/deletion polymorphisms. However, identifying TE polymorphisms, particularly transposition events or non-reference insertion sites, can be challenging due to the repetitive nature of these sequences, which hampers both the sensitivity and specificity of analysis tools. Methods. We have developed the tool RelocaTE2 for identification of TE insertion sites at high sensitivity and specificity. RelocaTE2 searches for known TE sequences in whole genome sequencing reads from second generation sequencing platforms such as Illumina. These sequence reads are used as seeds to pinpoint chromosome locations where TEs have transposed. RelocaTE2 detects the target site duplication (TSD) of TE insertions, allowing it to report TE polymorphism loci with single base pair precision. Results and Discussion. The performance of RelocaTE2 is evaluated using both simulated and real sequence data. RelocaTE2 demonstrates a high level of sensitivity and specificity, particularly when the sequence coverage is not shallow. In comparison to other tools tested, RelocaTE2 achieves the best balance between sensitivity and specificity. In particular, RelocaTE2 performs best in prediction of TSDs for TE insertions. Even in highly repetitive regions, such as those tested on rice chromosome 4, RelocaTE2 is able to report up to 95% of simulated TE insertions with a false positive rate below 0.1% using 10-fold genome coverage resequencing data.
RelocaTE2 provides a robust solution to identify TE insertion sites and can be incorporated into analysis workflows in support of describing the complete genotype from light coverage genome sequencing.
Pages: 12
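The seed-and-TSD idea described in the abstract can be illustrated with a toy sketch. This is not RelocaTE2's implementation: every name below (`REFERENCE`, `TE`, `left_flank_end`, `right_flank_start`, `infer_tsd`) is hypothetical, the sequences are made up, exact string matching stands in for real read alignment, and strand, mismatches, and repeats are ignored. It only shows why reads spanning both TE junctions can place an insertion at single base pair resolution: the two genomic flanks overlap on the reference by exactly the duplicated target site.

```python
# Toy illustration only; all sequences and names are assumptions.
REFERENCE = "GATTACAGGCTTAACCGTGATCCA"   # made-up reference sequence
TE = "GGCCAGTCACAATGG"                   # made-up transposable element
SEED = 8                                 # TE bases used as a seed at each end

# Simulate a sample genome: the TE inserts at the "TAA" site (0-based 11..13),
# which is duplicated on both sides of the element (the TSD).
sample = REFERENCE[:14] + TE + REFERENCE[11:]
left_read = sample[4:22]     # 10 bp of genomic flank + first 8 bp of the TE
right_read = sample[21:39]   # last 8 bp of the TE + 10 bp of genomic flank


def left_flank_end(read):
    """1-based reference position where the flank upstream of the TE ends."""
    i = read.find(TE[:SEED])             # seed: start of the TE inside the read
    if i <= 0:
        return None
    flank = read[:i]                     # trim the TE part, keep genomic flank
    pos = REFERENCE.find(flank)          # exact match stands in for alignment
    return None if pos < 0 else pos + len(flank)


def right_flank_start(read):
    """1-based reference position where the flank downstream of the TE starts."""
    i = read.find(TE[-SEED:])            # seed: end of the TE inside the read
    if i < 0 or i + SEED >= len(read):
        return None
    flank = read[i + SEED:]
    pos = REFERENCE.find(flank)
    return None if pos < 0 else pos + 1


def infer_tsd(left_end, right_start):
    """Overlap of the two flanks on the reference is the duplicated target site."""
    if left_end is None or right_start is None or right_start > left_end:
        return None
    return right_start, left_end, REFERENCE[right_start - 1:left_end]


tsd = infer_tsd(left_flank_end(left_read), right_flank_start(right_read))
print(tsd)  # (12, 14, 'TAA')
```

In this sketch the upstream flank ends at reference position 14 and the downstream flank starts at position 12, so the 3 bp overlap recovers the duplicated site and fixes the insertion coordinates exactly.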