Microarray and EST database estimates of mRNA expression levels differ: The protein length versus expression curve for C. elegans (art. no. 30)

Cited by: 24
Authors
Munoz, ET
Bogarad, LD
Deem, MW [1]
Affiliations
[1] Rice Univ, Dept Bioengn, Houston, TX 77005 USA
[2] Rice Univ, Dept Phys & Astron, Houston, TX 77005 USA
DOI
10.1186/1471-2164-5-30
Chinese Library Classification number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Various methods for estimating protein expression levels are known. The correlation between these methods is only fair, and systematic biases in each method cannot be ruled out. Here we investigate systematic biases in the estimation of gene expression rates from microarray data and from abundance within the Expressed Sequence Tag (EST) database. We suggest that length is a significant factor in biases in measured gene expression rates. As a specific example of the importance of the length bias in expression rate, we address the following evolutionary question: does the average C. elegans protein length increase or decrease with expression level? Two different answers to this question have been reported in the literature, one using expression levels estimated by abundance within the EST database and the other using microarrays. We investigated this issue by constructing the full protein length versus expression curve for C. elegans, using both methods for estimating expression levels.

Results: The microarray data show a monotonic decrease of length with expression level, whereas the EST database abundance data show non-monotonic behavior. Furthermore, the ratio of the expression level estimated from the EST database to that measured by microarrays is not constant, but is systematically biased with gene length.

Conclusions: We suggest that the length bias may lie primarily in the EST database abundance method, where it is not ameliorated by internal standards as it is in the microarray data, and that this bias should be removed before data interpretation. When this is done, both the microarray and the EST database abundance data give a monotonic decrease of spliced length with expression level, and the correlation between the EST and microarray data becomes larger. We suggest that standard RNA controls be used to normalize for length bias in any method that measures expression.
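The idea of a length versus expression curve can be illustrated with a minimal sketch. It assumes equal-count binning of genes ranked by expression level; the record does not state how the authors binned their data, so this is an illustration and not their procedure. The function names and the toy (expression, length) pairs are hypothetical and are not taken from the paper.

```python
from statistics import mean


def length_vs_expression_curve(genes, n_bins=4):
    """Return one (mean expression, mean length) point per expression bin.

    genes: list of (expression_level, protein_length) pairs.
    Genes are ranked by expression and split into n_bins groups of
    (nearly) equal size; the last bin absorbs any remainder.
    """
    if n_bins < 1 or n_bins > len(genes):
        raise ValueError("n_bins must be between 1 and the number of genes")
    ranked = sorted(genes, key=lambda gene: gene[0])
    size = len(ranked) // n_bins
    curve = []
    for i in range(n_bins):
        start = i * size
        end = (i + 1) * size if i < n_bins - 1 else len(ranked)
        chunk = ranked[start:end]
        curve.append((mean([e for e, _ in chunk]),
                      mean([length for _, length in chunk])))
    return curve


def is_monotonic_decreasing(values):
    """True if each value is no larger than the one before it."""
    return all(a >= b for a, b in zip(values, values[1:]))


# Hypothetical toy data: (expression level, protein length in residues).
toy_genes = [(1, 900), (2, 800), (3, 700), (4, 650),
             (5, 500), (6, 450), (7, 300), (8, 250)]

toy_curve = length_vs_expression_curve(toy_genes, n_bins=4)
print(toy_curve)
print(is_monotonic_decreasing([length for _, length in toy_curve]))
```

Running the same function on expression estimates from two different sources for the same genes, then comparing the shapes of the two curves, is one simple way to look for the kind of method-dependent disagreement the abstract describes.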
Pages: 6