Single-molecule Real-time (SMRT) Sequencing Facilitates Transcriptome Research and Genome Annotation of the Fish Sillago sinica

Cited by: 3
Authors
Zhang, Yuan [1]
Lou, Fangrui [2]
Chen, Jianwei [3]
Han, Zhiqiang [4]
Yang, Tianyan [4]
Gao, Tianxiang [4]
Song, Na [1]
Affiliations
[1] Ocean Univ China, Fishery Coll, Qingdao 266003, Peoples R China
[2] Yantai Univ, Sch Ocean, Yantai 264005, Peoples R China
[3] BGI Qingdao, BGI Shenzhen, Qingdao 266555, Peoples R China
[4] Zhejiang Ocean Univ, Fishery Coll, Zhoushan 316022, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Full-length transcriptome; Genome annotation; Sand whiting; Sillago sinica; SMRT sequencing; RNA-SEQ; PROTEIN; GENE; EXPRESSION; RECONSTRUCTION; SILLAGINIDAE; PREDICTION; EVOLUTION; ACCURACY; WATERS;
DOI
10.1007/s10126-022-10163-7
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification code
071005; 0836; 090102; 100705
Abstract
Chinese sillago (Sillago sinica) is a newly described species of Sillaginidae for which gene annotation information remains limited. In this study, we report the first full-length transcriptome of S. sinica, generated with PacBio isoform sequencing (Iso-Seq), together with an analysis of transcriptome structure. A total of 454,979 high-quality full-length transcripts were obtained by single-molecule real-time (SMRT) sequencing and corrected using Illumina sequencing data. Mapping to the S. sinica reference genome then yielded 66,948 non-redundant full-length transcripts, including 49 fusion isoforms and 9,250 novel isoforms. A total of 63,459 isoforms were successfully annotated in at least one of the Nr, Nt, SwissProt, Pfam, KOG, GO, and KEGG databases. Additionally, 30,987 alternative polyadenylation (APA) sites, 451,867 alternative splicing (AS) events, 21,928 long non-coding RNAs (lncRNAs), and 12,911 transcription factors (TFs) were identified. These full-length transcripts provide a valuable resource for characterizing the transcriptome of S. sinica and for further study of gene function and regulatory mechanisms in this species.
Pages: 1002-1013
Number of pages: 12