Bellerophontes: an RNA-Seq data analysis framework for chimeric transcripts discovery based on accurate fusion model

Cited by: 23
Authors
Abate, Francesco [1]
Acquaviva, Andrea [1]
Paciello, Giulia [1]
Foti, Carmelo [1]
Ficarra, Elisa [1]
Ferrarini, Alberto [2]
Delledonne, Massimo [2]
Iacobucci, Ilaria [3]
Soverini, Simona [3]
Martinelli, Giovanni [3]
Macii, Enrico [1]
Affiliations
[1] Politecn Torino, Dept Control & Comp Engn, I-10129 Turin, Italy
[2] Univ Verona, Dept Biotechnol, I-37134 Verona, Italy
[3] Univ Bologna, Inst Med Oncol & Hematol, I-40138 Bologna, Italy
Keywords
SPLICE JUNCTIONS; GENE FUSIONS; CANCER;
DOI
10.1093/bioinformatics/bts334
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Next-generation sequencing technology allows the detection of genomic structural variations, novel genes and transcript isoforms from the analysis of high-throughput data. In this work, we propose a new framework for detecting fusion transcripts from short paired-end reads. It integrates splicing-driven alignment and abundance estimation analysis, which yields a more accurate set of reads supporting junction discovery and also takes unannotated transcripts into account. Bellerophontes selects putative junctions by matching them against an accurate gene fusion model.
Results: We report the fusion genes discovered by the proposed framework in experimentally validated biological samples of chronic myelogenous leukemia (CML) and in public NCBI datasets, for which Bellerophontes is able to detect the exact junction sequence. Compared with state-of-the-art approaches, Bellerophontes detects the same experimentally validated fusions; however, it is more selective in the total number of detected fusions and provides a more accurate set of spanning reads supporting the junctions. Finally, we report the fusions involving unannotated transcripts found in the CML samples.
Pages: 2114-2121
Page count: 8
Related papers
21 in total
  • [1] Global and unbiased detection of splice junctions from RNA-seq data
    Ameur, Adam
    Wetterbom, Anna
    Feuk, Lars
    Gyllensten, Ulf
    [J]. GENOME BIOLOGY, 2010, 11 (03):
  • [2] Integrative analysis of the melanoma transcriptome
    Berger, Michael F.
    Levin, Joshua Z.
    Vijayendran, Krishna
    Sivachenko, Andrey
    Adiconis, Xian
    Maguire, Jared
    Johnson, Laura A.
    Robinson, James
    Verhaak, Roel G.
    Sougnez, Carrie
    Onofrio, Robert C.
    Ziaugra, Liuda
    Cibulskis, Kristian
    Laine, Elisabeth
    Barretina, Jordi
    Winckler, Wendy
    Fisher, David E.
    Getz, Gad
    Meyerson, Matthew
    Jaffe, David B.
    Gabriel, Stacey B.
    Lander, Eric S.
    Dummer, Reinhard
    Gnirke, Andreas
    Nusbaum, Chad
    Garraway, Levi A.
    [J]. GENOME RESEARCH, 2010, 20 (04) : 413 - 427
  • [3] Supersplat-spliced RNA-seq alignment
    Bryant, Douglas W., Jr.
    Shen, Rongkun
    Priest, Henry D.
    Wong, Weng-Keen
    Mockler, Todd C.
    [J]. BIOINFORMATICS, 2010, 26 (12) : 1500 - 1505
  • [4] Dongen J. J., 1999, LEUKEMIA, V13, P110
  • [5] Identification of fusion genes in breast cancer by paired-end RNA-sequencing
    Edgren, Henrik
    Murumagi, Astrid
    Kangaspeska, Sara
    Nicorici, Daniel
    Hongisto, Vesa
    Kleivi, Kristine
    Rye, Inga H.
    Nyberg, Sandra
    Wolf, Maija
    Borresen-Dale, Anne-Lise
    Kallioniemi, Olli
    [J]. GENOME BIOLOGY, 2011, 12 (01):
  • [6] The UCSC Genome Browser database: update 2011
    Fujita, Pauline A.
    Rhead, Brooke
    Zweig, Ann S.
    Hinrichs, Angie S.
    Karolchik, Donna
    Cline, Melissa S.
    Goldman, Mary
    Barber, Galt P.
    Clawson, Hiram
    Coelho, Antonio
    Diekhans, Mark
    Dreszer, Timothy R.
    Giardine, Belinda M.
    Harte, Rachel A.
    Hillman-Jackson, Jennifer
    Hsu, Fan
    Kirkup, Vanessa
    Kuhn, Robert M.
    Learned, Katrina
    Li, Chin H.
    Meyer, Laurence R.
    Pohl, Andy
    Raney, Brian J.
    Rosenbloom, Kate R.
    Smith, Kayla E.
    Haussler, David
    Kent, W. James
    [J]. NUCLEIC ACIDS RESEARCH, 2011, 39 : D876 - D882
  • [7] FusionMap: detecting fusion genes from next-generation sequencing data at base-pair resolution
    Ge, Huanying
    Liu, Kejun
    Juan, Todd
    Fang, Fang
    Newman, Matthew
    Hoeck, Wolfgang
    [J]. BIOINFORMATICS, 2011, 27 (14) : 1922 - 1928
  • [8] ChimeraScan: a tool for identifying chimeric transcription in sequencing data
    Iyer, Matthew K.
    Chinnaiyan, Arul M.
    Maher, Christopher A.
    [J]. BIOINFORMATICS, 2011, 27 (20) : 2903 - 2904
  • [9] Ultrafast and memory-efficient alignment of short DNA sequences to the human genome
    Langmead, Ben
    Trapnell, Cole
    Pop, Mihai
    Salzberg, Steven L.
    [J]. GENOME BIOLOGY, 2009, 10 (03):
  • [10] TreeFam: a curated database of phylogenetic trees of animal gene families
    Li, Heng
    Coghlan, Avril
    Ruan, Jue
    Coin, Lachlan James
    Heriche, Jean-Karim
    Osmotherly, Lara
    Li, Ruiqiang
    Liu, Tao
    Zhang, Zhang
    Bolund, Lars
    Wong, Gane Ka-Shu
    Zheng, Weimou
    Dehal, Paramvir
    Wang, Jun
    Durbin, Richard
    [J]. NUCLEIC ACIDS RESEARCH, 2006, 34 : D572 - D580