Exploration of novel motifs derived from mouse cDNA sequences

Cited by: 21
Authors
Kawaji, H
Schönbach, C
Matsuo, Y
Kawai, J
Okazaki, Y
Hayashizaki, Y
Matsuda, H [1]
Affiliations
[1] RIKEN, Genom Sci Ctr, Bioinformat Grp, Computat Genom Team, GSC, Yokohama, Kanagawa 2300045, Japan
[2] Osaka Univ, Grad Sch Engn Sci, Dept Informat & Math Sci, Toyonaka, Osaka 5608531, Japan
[3] Nippon Telegraph & Tel Software Corp, Yokohama, Kanagawa 2318554, Japan
[4] RIKEN, Genom Sci Ctr, Lab Genome Explorat Res Grp, GSC, Yokohama, Kanagawa 2300045, Japan
Keywords
DOI
10.1101/gr.193702
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular biology];
Subject classification code
071010 ; 081704 ;
Abstract
We performed a systematic maximum density subgraph (MDS) detection of conserved sequence regions to discover new, biologically relevant motifs from a set of 21,050 conceptually translated mouse cDNA (FANTOM1) sequences. A total of 3202 candidate sequences, which shared similar regions over >20 amino acid residues, were screened against known conserved regions listed in Pfam, ProDom, and InterPro. The filtering procedure resulted in 139 FANTOM1 sequences belonging to 49 new motif candidates. Using annotations and multiple sequence alignment information, we removed by visual inspection 42 candidates whose members were found to be false positives because of sequence redundancy, alternative splicing, low complexity, transcribed retroviral repeat elements contained in the region of the predicted open reading frame, and reports in the literature. The remaining seven motifs were expanded by hidden Markov model (HMM) profile searches of SWISS-PROT/TrEMBL from 28 FANTOM1 sequences to 164 members and analyzed in detail at the sequence and structure level to elucidate the possible functions of the motifs and their members. The novel and conserved motif MDS00105 is specific for the mammalian inhibitor of growth (ING) family. Three submotifs, MDS00105.1-3, are specific for the ING1/ING1L, ING1-homolog, and ING3 subfamilies. The motif MDS00105 together with a PHD finger domain constitutes a module for ING proteins. Structural motif MDS00113 represents a leucine zipper-like motif. Conserved motif MDS00145 is a novel 1-acyl-SN-glycerol-3-phosphate acyltransferase (AGPAT) submotif containing a transmembrane domain that distinguishes AGPAT3 and AGPAT4 from all other acyltransferase domain-containing proteins. Functional motif MDS00148 overlaps with the Kazal-type serine protease inhibitor domain but has been detected only in an extracellular loop region of solute carrier 21 (SLC21) (organic anion transporters) family members, which may regulate the specificity of anion uptake.
Our motif discovery not only aided in the functional characterization of new mouse orthologs for potential drug targets but also allowed us to predict that at least 16 other new motifs are waiting to be discovered from the current SWISS-PROT/TrEMBL database.
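The abstract names maximum density subgraph detection but does not describe the graph construction or the search procedure. As a rough illustration only, the sketch below assumes that vertices are sequences and that an edge joins two sequences sharing a similar region; both the edge list and the function name `densest_subgraph_greedy` are hypothetical. It uses greedy peeling, a standard approximation for the densest-subgraph problem, which is not necessarily the method the authors used.

```python
from collections import defaultdict


def densest_subgraph_greedy(edges):
    """Approximate the maximum density subgraph by greedy peeling.

    Density is defined as (number of edges) / (number of vertices).
    The procedure repeatedly removes a minimum-degree vertex and keeps
    the densest intermediate vertex set seen along the way.
    """
    # Build an undirected adjacency map, ignoring self-loops.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    nodes = set(adj)
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    best_nodes, best_density = set(), 0.0

    while nodes:
        density = n_edges / len(nodes)
        if density > best_density:
            best_nodes, best_density = set(nodes), density
        # Peel one vertex of minimum degree (ties broken by name).
        u = min(nodes, key=lambda x: (len(adj[x]), str(x)))
        for v in adj[u]:
            adj[v].discard(u)
        n_edges -= len(adj[u])
        nodes.remove(u)
        del adj[u]

    return best_nodes, best_density


# Hypothetical similarity graph: four mutually similar sequences (a-d)
# plus two loosely attached ones (e, f).
example_edges = [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "c"), ("b", "d"), ("c", "d"),
    ("a", "e"), ("e", "f"),
]
cluster, density = densest_subgraph_greedy(example_edges)
print(sorted(cluster), density)
```

In this toy graph the four mutually similar sequences form the densest group, which is the kind of cluster that could then be aligned and examined as a motif candidate.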
Pages: 367-378
Page count: 12