ChimeraTE: a pipeline to detect chimeric transcripts derived from genes and transposable elements

Cited by: 10
Authors
Oliveira, Daniel S. [1, 2]
Fablet, Marie [2, 3]
Larue, Anais [2, 4]
Vallier, Agnes [4]
Carareto, Claudia M. A. [1]
Rebollo, Rita [4]
Vieira, Cristina [2]
Affiliations
[1] Sao Paulo State Univ Unesp, Inst Biosci Humanities & Exact Sci, Sao Jose Do Rio Preto, SP, Brazil
[2] Univ Lyon 1, Lab Biometrie & Biol Evolut, CNRS, UMR5558, F-69100 Villeurbanne, Rhone Alpes, France
[3] IUF, F-75231 Paris, Ile De France, France
[4] Univ Lyon, INRAE, INSA Lyon, BF2I,UMR 203, F-69621 Villeurbanne, France
Funding
European Union Horizon 2020;
Keywords
LONG TERMINAL REPEAT; RNA-SEQ DATA; DROSOPHILA-MELANOGASTER; INSECTICIDE RESISTANCE; DIFFERENTIAL GENE; EXPRESSION; PROMOTER; DECAY; POPULATION; SEQUENCES;
DOI
10.1093/nar/gkad671
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Transposable elements (TEs) produce structural variants and are considered an important source of genetic diversity. Notably, TE-gene fusion transcripts, i.e. chimeric transcripts, have been associated with adaptation in several species. However, the identification of these chimeras remains hindered by the lack of detection tools at a transcriptome-wide scale, and by the reliance on a reference genome, even though different individuals/cells/strains have different TE insertions. Therefore, we developed ChimeraTE, a pipeline that uses paired-end RNA-seq reads to identify chimeric transcripts through two different modes. Mode 1 is the reference-guided approach that employs canonical genome alignment, and Mode 2 identifies chimeras derived from fixed or insertionally polymorphic TEs without any reference genome. We have validated both modes using RNA-seq data from four Drosophila melanogaster wild-type strains. We found ~1.12% of all genes generating chimeric transcripts, most of them from TE-exonized sequences. Approximately 23% of all detected chimeras were absent from the reference genome, indicating that TEs belonging to chimeric transcripts may be recent, polymorphic insertions. ChimeraTE is the first pipeline able to automatically uncover chimeric transcripts without a reference genome, consisting of two running Modes that can be used as a tool to investigate the contribution of TEs to transcriptome plasticity.
Pages: 9764-9784
Page count: 21