Using a new GPI-anchored-protein identification system to mine the protein databases of Aspergillus fumigatus, Aspergillus nidulans, and Aspergillus oryzae

Cited by: 11
Authors
Cao, Wei [1]
Maruyama, Jun-ichi [1]
Kitamoto, Katsuhiko [1]
Sumikoshi, Kazuya [1]
Terada, Tohru [2]
Nakamura, Shugo [1]
Shimizu, Kentaro [1]
Affiliations
[1] Univ Tokyo, Grad Sch Agr & Life Sci, Dept Biotechnol, Bunkyo Ku, Tokyo 1138657, Japan
[2] Univ Tokyo, Grad Sch Agr & Life Sci, Agr Bioinformat Res Unit, Bunkyo Ku, Tokyo 1138657, Japan
Keywords
Aspergillus fumigatus; Aspergillus nidulans; Aspergillus oryzae; GPI; SVM; SACCHAROMYCES-CEREVISIAE; CANDIDA-ALBICANS; GENOME-WIDE; MEMBRANE-PROTEINS; SIGNAL PEPTIDES; OMEGA-SITE; GLYCOSYLPHOSPHATIDYLINOSITOL; PREDICTION; SEQUENCE; GENE;
DOI
10.2323/jgam.55.381
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Computational approaches provide valuable information to start experimental surveys identifying glycosylphosphatidylinositol (GPI)-anchored proteins in protein sequence databases. We developed a new sequence-based identification system that uses an optimized classifier based on a support vector machine (SVM) algorithm to recognize appropriate COOH-terminal sequences and uses a classifier implementing a simple majority voting strategy to recognize appropriate NH2-terminal sequences. The SVM classifier showed high accuracy (96%) in 5-fold cross-validation testing, and the majority voting classifier showed high recall (98.88%) when applied to a test dataset of eukaryote proteins. When applied to S. cerevisiae protein sequences, the new identification system showed good ability to classify "unseen" data. Applying our system to protein sequences of three aspergilli, we identified 115 GPI-anchored proteins in Aspergillus fumigatus, 129 in Aspergillus nidulans, and 136 in Aspergillus oryzae. Sequence-based conserved domain search found nearly half of these proteins to have conserved domains that covered a wide range of functions.
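As an illustration of the simple majority voting strategy mentioned in the abstract, the following is a minimal sketch only. The abstract does not say which NH2-terminal predictors were combined or how ties were handled, so the three toy predictors below (starts_with_met, hydrophobic_stretch, positive_n_region), their thresholds, and the strict "more than half" rule are all assumptions made for illustration, not the authors' method.

```python
from typing import Callable, Sequence

# A predictor takes a protein sequence and votes True (signal-like) or False.
Predictor = Callable[[str], bool]

HYDROPHOBIC = set("AILMFVW")  # assumed residue set, for illustration only


def starts_with_met(seq: str) -> bool:
    """Hypothetical toy predictor: the sequence starts with methionine."""
    return seq.startswith("M")


def hydrophobic_stretch(seq: str, window: int = 20, cutoff: float = 0.5) -> bool:
    """Hypothetical toy predictor: the first `window` residues are mostly hydrophobic."""
    head = seq[:window]
    if not head:
        return False
    fraction = sum(1 for residue in head if residue in HYDROPHOBIC) / len(head)
    return fraction >= cutoff


def positive_n_region(seq: str) -> bool:
    """Hypothetical toy predictor: a K or R appears in residues 2 to 6."""
    return any(residue in "KR" for residue in seq[1:6])


def majority_vote(seq: str, predictors: Sequence[Predictor]) -> bool:
    """Return True only when strictly more than half of the predictors vote True."""
    votes = sum(1 for predict in predictors if predict(seq))
    return votes * 2 > len(predictors)


PREDICTORS = [starts_with_met, hydrophobic_stretch, positive_n_region]

if __name__ == "__main__":
    candidate = "MKLLLLLAVVALAGSASAQDEQDEQDE"
    print(majority_vote(candidate, PREDICTORS))
```

With an odd number of predictors a tie cannot occur; with an even number, the strict rule above treats a tie as a negative call, which is one possible design choice among several.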
Pages: 381-393
Page count: 13