Mining housekeeping genes with a Naive Bayes classifier

Cited by: 43
Authors
De Ferrari, Luna [1]
Aitken, Stuart [1]
Affiliations
[1] Univ Edinburgh, Sch Informat, Edinburgh EH8 9LE, Midlothian, Scotland
DOI
10.1186/1471-2164-7-277
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Traditionally, housekeeping and tissue-specific genes have been classified by directly assaying mRNA presence across different tissues, but these experiments are costly and their results are not easy to compare or reproduce. Results: In this work, a Naive Bayes classifier based only on physical and functional characteristics of genes already available in databases, such as exon length and measures of chromatin compactness, achieved a 97% success rate in classifying human housekeeping genes (93% for mouse and 90% for fruit fly). Conclusion: The newly obtained lists of housekeeping and tissue-specific genes adhere to the expected functions and tissue expression patterns for the two classes. Overall, the classifier shows promise, and in the future additional attributes might be included to improve its discriminating power.
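The classification idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the attributes were modelled, so the Gaussian likelihoods here are an assumption, and the feature values, the direction of each feature, and the function names `fit_gaussian_nb` and `predict` are invented for illustration only.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a class prior and per-feature (mean, variance) for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    model = {}
    for label, members in by_class.items():
        n = len(members)
        stats = []
        for column in zip(*members):
            mean = sum(column) / n
            # Small constant keeps the variance strictly positive.
            var = sum((x - mean) ** 2 for x in column) / n + 1e-9
            stats.append((mean, var))
        model[label] = (n / len(rows), stats)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior under the naive
    (feature-independence) assumption."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for x, (mean, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy values, NOT data from the paper:
# (mean exon length in bp, a hypothetical chromatin-compactness score in [0, 1])
rows = [
    (150.0, 0.20), (160.0, 0.25), (140.0, 0.15),
    (300.0, 0.70), (320.0, 0.80), (310.0, 0.75),
]
labels = ["housekeeping"] * 3 + ["tissue_specific"] * 3

model = fit_gaussian_nb(rows, labels)
print(predict(model, (155.0, 0.22)))
```

Working in log space avoids numerical underflow when many attributes are multiplied together, which matters once more than a handful of gene characteristics are included.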
Pages: 14