ChopStitch: exon annotation and splice graph construction using transcriptome assembly and whole genome sequencing data

Cited by: 3
Authors
Khan, Hamza [1]
Mohamadi, Hamid [1]
Vandervalk, Benjamin P. [1]
Warren, Rene L. [1]
Chu, Justin [1]
Birol, Inanc [1]
Affiliation
[1] British Columbia Cancer Agency, Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
Funding
National Institutes of Health (United States)
Keywords
RNA-SEQ; TOOL; RECONSTRUCTION;
DOI
10.1093/bioinformatics/btx839
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Sequencing studies on non-model organisms often interrogate both genomes and transcriptomes with massive amounts of short sequences. Such studies require de novo analysis tools and techniques when the species and its close relatives lack high-quality reference resources. For certain applications, such as de novo annotation, information on putative exons and alternative splicing may be desirable.
Results: Here we present ChopStitch, a new method for finding putative exons de novo and constructing splice graphs using an assembled transcriptome and whole genome shotgun sequencing (WGSS) data. ChopStitch identifies exon-exon boundaries in de novo assembled RNA-Seq data with the help of a Bloom filter that represents the k-mer spectrum of WGSS reads. The algorithm also accounts for base substitutions in transcript sequences that may arise from sequencing or assembly errors, haplotype variations, or putative RNA editing events. The primary output of the tool is a FASTA file containing putative exons. In addition, exon edges are interrogated for alternative exon-exon boundaries to detect transcript isoforms, which are represented as splice graphs in DOT output format.
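The idea behind the boundary detection can be illustrated with a small Python sketch. It is not the ChopStitch implementation: the `BloomFilter` class, the function names, the toy sequences, and the rule "a run of exactly k-1 absent k-mers marks a junction" are all assumptions made for illustration. The sketch relies on one observation: a k-mer that spans an exon-exon junction in a transcript is normally absent from genomic reads, because an intron separates the two exons in the genome. Substitution tolerance and Bloom filter false positives, which the abstract says the real tool handles, are ignored here.

```python
import hashlib


class BloomFilter:
    """Tiny illustrative Bloom filter (sizes are arbitrary assumptions)."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_filter(reads, k):
    """Load the k-mer spectrum of the genomic reads into a Bloom filter."""
    bf = BloomFilter()
    for read in reads:
        for kmer in kmers(read, k):
            bf.add(kmer)
    return bf


def putative_boundaries(transcript, bf, k):
    """Report position j when exactly k-1 consecutive transcript k-mers
    are missing from the filter and the k-mer starting at j is present."""
    boundaries, run_start = [], None
    for i, kmer in enumerate(kmers(transcript, k)):
        present = kmer in bf
        if not present and run_start is None:
            run_start = i
        elif present and run_start is not None:
            if i - run_start == k - 1:
                boundaries.append(i)
            run_start = None
    return boundaries


def split_exons(transcript, boundaries):
    """Cut the transcript at the reported boundaries."""
    cuts = [0, *boundaries, len(transcript)]
    return [transcript[a:b] for a, b in zip(cuts, cuts[1:])]


def splice_graph_dot(exon_chains):
    """Emit a DOT digraph with one edge per observed exon-exon adjacency."""
    edges = sorted({(a, b) for chain in exon_chains
                    for a, b in zip(chain, chain[1:])})
    lines = ["digraph splice_graph {"]
    lines += [f'  "{a}" -> "{b}";' for a, b in edges]
    lines.append("}")
    return "\n".join(lines)


# Toy data (made up): two exons separated by an intron in the genome.
K = 6
exon1, intron, exon2 = "ATGGCGTACCTGAAGC", "GTAAGTTTTCCCTTTTAG", "TCGATCGGATTACAGG"
wgs_reads = [exon1 + intron + exon2]   # stand-in for WGSS reads
transcript = exon1 + exon2             # spliced transcript

bf = build_filter(wgs_reads, K)
boundaries = putative_boundaries(transcript, bf, K)
exons = split_exons(transcript, boundaries)
names = [f"exon_{i + 1}" for i in range(len(exons))]
print(boundaries, exons)
print(splice_graph_dot([names]))
```

A probabilistic set such as a Bloom filter keeps memory use modest when the k-mer spectrum comes from a whole genome, at the cost of occasional false positives; a production tool would need to account for those as well as for substitutions, which this sketch does not attempt.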
Pages: 1697-1704
Number of pages: 8