The abundance of short proteins in the mammalian proteome

Cited by: 160
Authors
Frith, Martin C.
Forrest, Alistair R.
Nourbakhsh, Ehsan
Pang, Ken C.
Kai, Chikatoshi
Kawai, Jun
Carninci, Piero
Hayashizaki, Yoshihide
Bailey, Timothy L.
Grimmond, Sean M. [1]
Affiliations
[1] Univ Queensland, Inst Mol Biosci, Brisbane, Qld, Australia
[2] RIKEN, Genom Sci Ctr, Yokohama Inst, Genome Explorat Res Grp,Genome Network Project Co, Yokohama, Kanagawa, Japan
[3] Ludwig Inst Canc Res, Austin & Repatriat Med Ctr, Heidelberg, Vic, Australia
[4] RIKEN, Genome Sci Lab, Discovery Res Inst, Wako Inst, Wako, Saitama 35101, Japan
Source
PLOS GENETICS | 2006 / Vol. 2 / Issue 04
Keywords
DOI
10.1371/journal.pgen.0020052
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Short proteins play key roles in cell signalling and other processes, but their abundance in the mammalian proteome is unknown. Current catalogues of mammalian proteins exhibit an artefactual discontinuity at a length of 100 aa, so that protein abundance peaks just above this length and falls off sharply below it. To clarify the abundance of short proteins, we identify proteins in the FANTOM collection of mouse cDNAs by analysing synonymous and nonsynonymous substitutions with the computer program CRITICA. This analysis confirms that there is no real discontinuity at length 100. Roughly 10% of mouse proteins are shorter than 100 aa, although the majority of these are variants of proteins longer than 100 aa. We identify many novel short proteins, including a "dark matter" subset containing ones that lack detectable homology to other known proteins. Translation assays confirm that some of these novel proteins can be translated and localised to the secretory pathway.
Pages: 515-528
Page count: 14