Generic eukaryotic core promoter prediction using structural features of DNA

Cited by: 164
Authors
Abeel, Thomas [1, 2]
Saeys, Yvan [1, 2]
Bonnet, Eric [1, 2]
Rouze, Pierre [1, 2, 3]
Van de Peer, Yves [1, 2]
Affiliations
[1] VIB, Flanders Inst Biotechnol, Dept Plant Syst Biol, B-9052 Ghent, Belgium
[2] Univ Ghent, Dept Mol Genet, B-9052 Ghent, Belgium
[3] Univ Ghent, Lab Associe IINRA France, B-9052 Ghent, Belgium
DOI
10.1101/gr.6991408
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Despite many recent efforts, in silico identification of promoter regions is still in its infancy. However, the accurate identification and delineation of promoter regions is important for several reasons, such as improving genome annotation and devising experiments to study and understand transcriptional regulation. Current methods to identify the core region of promoters require large amounts of high-quality training data and often behave like black box models that output predictions that are difficult to interpret. Here, we present a novel approach for predicting promoters in whole-genome sequences by using large-scale structural properties of DNA. Our technique requires no training, is applicable to many eukaryotic genomes, and performs extremely well in comparison with the best available promoter prediction programs. Moreover, it is fast, simple in design, and has no size constraints, and the results are easily interpretable. We compared our approach with 14 current state-of-the-art implementations using human gene and transcription start site data and analyzed the ENCODE region in more detail. We also validated our method on 12 additional eukaryotic genomes, including vertebrates, invertebrates, plants, fungi, and protists.
Pages: 310-323
Page count: 14