The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts

Cited by: 48
Authors
Yamasaki, Chisato [1 ,2 ]
Murakami, Katsuhiko [1 ,2 ]
Fujii, Yasuyuki [3 ]
Sato, Yoshiharu [1 ]
Harada, Erimi [1 ,2 ]
Takeda, Jun-Ichi [1 ,2 ]
Taniya, Takayuki [1 ,2 ]
Sakate, Ryuichi [1 ,2 ]
Kikugawa, Shingo [1 ,2 ]
Shimada, Makoto [1 ,2 ]
Tanino, Motohiko [4 ]
Koyanagi, Kanako O. [5 ]
Barrero, Roberto A.
Gough, Craig [1 ,2 ]
Chun, Hong-Woo [1 ,2 ]
Habara, Takuya [2 ]
Hanaoka, Hideki [7 ]
Hayakawa, Yosuke [2 ,8 ]
Hilton, Phillip B. [1 ,2 ]
Kaneko, Yayoi [9 ]
Kanno, Masako [1 ,2 ]
Kawahara, Yoshihiro [1 ,2 ]
Kawamura, Toshiyuki [10 ]
Matsuya, Akihiro [2 ,11 ]
Nagata, Naoki [12 ]
Nishikata, Kensaku [2 ,13 ]
Noda, Akiko Ogura [1 ,2 ]
Nurimoto, Shin [14 ]
Saichi, Naomi [1 ,2 ]
Sakai, Hiroaki [15 ]
Sanbonmatsu, Ryoko [1 ,2 ]
Shiba, Rie [1 ,2 ]
Suzuki, Mami [1 ,2 ]
Takabayashi, Kazuhiko
Takahashi, Aiko [1 ,2 ]
Tamura, Takuro [16 ]
Tanaka, Masayuki [1 ]
Tanaka, Susumu [17 ]
Todokoro, Fusano [2 ,18 ]
Yamaguchi, Kaori [2 ]
Yamamoto, Naoyuki [2 ]
Okido, Toshihisa [19 ,20 ]
Mashima, Jun [19 ,20 ]
Hashizume, Aki [19 ,20 ]
Jin, Lihua [19 ,20 ]
Lee, Kyung-Bum [19 ,20 ]
Lin, Yi-Chueh [19 ,20 ]
Nozaki, Asami [19 ,20 ]
Sakai, Katsunaga [19 ,20 ]
Tada, Masahito [20 ]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, Biol Informat Res Ctr, Tokyo, Japan
[2] Japan Biol Informat Consortium, Japan Biol Informat Res Ctr, Tokyo, Japan
[3] Okayama Univ, Grad Sch Med, Dent & Pharmaceut Sci, Okayama, Japan
[4] DNA Chip Res Inc, Kanagawa, Japan
[5] Hokkaido Univ, Sapporo, Hokkaido 060, Japan
[6] Murdoch Univ, Ctr Comparat Genom, Murdoch, WA 6150, Australia
[7] Univ Tokyo, Biotechnol Res Ctr, Tokyo, Japan
[8] Hitachi Software Engn Co Ltd, Tokyo, Japan
[9] Mitsubishi Kagaku Inst Life Sci, Tokyo, Japan
[10] Fujitsu Ltd, Tokyo, Japan
[11] Hitachi Co Ltd, Hatoyama, Saitama, Japan
[12] Japan Sci & Technol Agcy, Tokyo, Japan
[13] NEC Soft Ltd, Tokyo, Japan
[14] Mitsui Knowledge Ind Co Ltd, Tokyo, Japan
[15] Natl Inst Agrobiol Sci, Ibaraki, Japan
[16] BITS Co Ltd, Shizuoka, Japan
[17] Tokyo Inst Psychiatry, Tokyo, Japan
[18] Dynacom Co Ltd, Chiba, Japan
[19] Natl Inst Genet, DNA Data Bank, Shizuoka, Japan
[20] Ctr Informat Biol, Shizuoka, Japan
[21] Tokyo Univ Sci, Chiba, Japan
[22] Univ Dublin, Trinity Coll, Dublin, Ireland
[23] Mitsubishi Space Software Co Ltd, Ibaraki, Japan
[24] Natl Inst Genet, Div Populat Genet, Shizuoka, Japan
[25] EMBL Outstn Hinxton, European Bioinformat Inst, Cambridge, England
[26] Med Coll Wisconsin, Bioinformat Res Ctr, Milwaukee, WI USA
[27] Japan Atom Energy Agcy, Ctr Computat Sci & Engn, Kyoto, Japan
[28] Hitachi Ltd, Cent Res Lab, Hatoyama, Saitama, Japan
[29] Natl Inst Adv Ind Sci & Technol, Computat Biol Res Ctr, Tokyo, Japan
[30] Kazusa DNA Res Inst, Dept Human Gene, Chiba, Japan
[31] Weizmann Inst Sci, Dept Mol Genet, IL-76100 Rehovot, Israel
[32] CNRS, Villejuif, France
[33] Univ Paris 06, Villejuif, France
[34] Sino French Lab Life Sci & Genom, Shanghai, Peoples R China
[35] CNRS, Ctr Genet Mol, Gif Sur Yvette, France
[36] Gif Orsay DNA Microarray Platform, Gif Sur Yvette, France
[37] IRCM, DSV, CEA, Lab Genomes Funct Explorat, Evry, France
[38] RIKEN, Yokohama Inst, Genom Sci Ctr, Kanagawa, Japan
[39] RIKEN, Wako Inst, Discovery & Res Inst, Genome Sci Lab, Saitama, Japan
[40] GSF, Natl Res Ctr Environm & Hlth, Inst Bioinformat, Neuherberg, Germany
[41] Idaho State Univ, Pocatello, ID 83209 USA
[42] Kyoto Univ, Inst Chem Res, Kyoto, Japan
[43] Univ Munster, Inst Bioinformat, Munster, Germany
[44] Kazusa DNA Res Inst, Chiba, Japan
[45] Korea Res Inst Biosci & Biotechnol, Taejon, South Korea
[46] Ludwig Inst Canc Res, Sao Paulo, Brazil
[47] Univ Iowa, Med Educ & Biomed Res Facility, Iowa City, IA 52242 USA
[48] Tokyo Med & Dent Univ, Inst Med Res, Tokyo, Japan
[49] German Canc Res Ctr, Mol Genome Anal, D-6900 Heidelberg, Germany
[50] Nagahama Inst Bio Sci & Technol, Shiga, Japan
Funding
Wellcome Trust (UK);
Keywords
DOI
10.1093/nar/gkm999
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Here we report the new features and improvements in our latest release of the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/), a comprehensive annotation resource for human genes and transcripts. H-InvDB, originally developed as an integrated database of the human transcriptome based on extensive annotation of large sets of full-length cDNA (FLcDNA) clones, now provides annotation for 120 558 human mRNAs extracted from the International Nucleotide Sequence Databases (INSD), in addition to 54 978 human FLcDNAs, in the latest release H-InvDB_4.6. We mapped those human transcripts onto the human genome sequences (NCBI build 36.1) and determined 34 699 human gene clusters, which could define 34 057 (98.1%) protein-coding and 642 (1.9%) non-protein-coding loci; 858 (2.5%) transcribed loci overlapped with predicted pseudogenes. For all these transcripts and genes, we provide comprehensive annotation including gene structures, gene functions, alternative splicing variants, functional non-protein-coding RNAs, functional domains, predicted subcellular localizations, metabolic pathways, predictions of protein 3D structure, mapping of SNPs and microsatellite repeat motifs, co-localization with orphan diseases, gene expression profiles, orthologous genes, protein-protein interactions (PPI) and annotation for gene families. The current H-InvDB annotation resources consist of two main views, Transcript view and Locus view, and eight sub-databases: the DiseaseInfo Viewer, H-ANGEL, the Clustering Viewer, G-integra, the TOPO Viewer, Evola, the PPI view and the Gene family/group.
Pages: D793-D799
Page count: 7