Comprehensive annotation of the Chinese tree shrew genome by large-scale RNA sequencing and long-read isoform sequencing

Cited by: 27
Authors
Ye, Mao-Sen [1,2]
Zhang, Jin-Yan [1,2]
Yu, Dan-Dan [1,3]
Xu, Min [1,3]
Xu, Ling [1,3]
Lv, Long-Bao [3]
Zhu, Qi-Yun [4]
Fan, Yu [1,3]
Yao, Yong-Gang [1,2,3]
Affiliations
[1] Chinese Acad Sci, Key Lab Anim Models & Human Dis Mech, Kunming Inst Zool, KIZ CUHK Joint Lab Bioresources & Mol Res Common, Kunming 650204, Yunnan, Peoples R China
[2] Univ Chinese Acad Sci, Kunming Coll Life Sci, Kunming 650204, Yunnan, Peoples R China
[3] Chinese Acad Sci, Natl Resource Ctr Nonhuman Primates, Kunming Inst Zool, Natl Res Facil Phenotyp & Genet Anal Model Anim P, Kunming 650107, Yunnan, Peoples R China
[4] Chinese Acad Agr Sci, Lanzhou Vet Res Inst, State Key Lab Vet Etiol Biol, Lanzhou 730046, Gansu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Tree shrew; Genome annotation; Transcriptome; Gene family; Virus infection; TUPAIA-BELANGERI; ANIMAL-MODELS; INDUCED MYOPIA; GENE; PROTEIN; FAMILY; TRANSCRIPTOME; BIOGENESIS; GENERATION; PRIMATES;
DOI
10.24272/j.issn.2095-8137.2021.272
Chinese Library Classification
Q95 [Zoology];
Discipline classification code
071002;
Abstract
The Chinese tree shrew (Tupaia belangeri chinensis) is emerging as an important experimental animal in multiple fields of biomedical research. Comprehensive reference genome annotation for both mRNA and long non-coding RNA (lncRNA) is crucial for developing animal models using this species. In the current study, we collected a total of 234 high-quality RNA sequencing (RNA-seq) datasets and two long-read isoform sequencing (ISO-seq) datasets and improved the annotation of our previously assembled high-quality chromosome-level tree shrew genome. We obtained a total of 3 514 newly annotated coding genes and 50 576 lncRNA genes. We also characterized the tissue-specific expression patterns and alternative splicing patterns of mRNAs and lncRNAs and mapped the orthologous relationships among 11 mammalian species using the current annotated genome. We identified 144 tree shrew-specific gene families, including interleukin 6 (IL6) and STT3 oligosaccharyltransferase complex catalytic subunit B (STT3B), which underwent significant changes in size. Comparison of the overall expression patterns in tissues and pathways across four species (human, rhesus monkey, tree shrew, and mouse) indicated that tree shrews are more similar to primates than to mice at the tissue-transcriptome level. Notably, the newly annotated purine rich element binding protein A (PURA) gene and the STT3B gene family showed dysregulation upon viral infection. The updated version of the tree shrew genome annotation (KIZ version 3: TS_3.0) is available at http://www.treeshrewdb.org and provides an essential reference for basic and biomedical studies using tree shrew animal models.
Pages: 692-709
Number of pages: 18