Robust prostate cancer marker genes emerge from direct integration of inter-study microarray data

Cited by: 100
Authors
Xu, L [1]
Tan, AC
Naiman, DQ
Geman, D
Winslow, RL
Affiliations
[1] Johns Hopkins Univ, Whitaker Biomed Engn Inst, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Ctr Cardiovasc Bioinformat & Modeling, Baltimore, MD 21218 USA
[3] Johns Hopkins Univ, Dept Appl Math & Stat, Baltimore, MD 21218 USA
[4] Johns Hopkins Univ, Ctr Imaging Sci, Baltimore, MD 21218 USA
DOI
10.1093/bioinformatics/bti647
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: DNA microarray data analysis has been used previously to identify marker genes which discriminate cancer from normal samples. However, due to the limited sample size of each study, there are few common markers among different studies of the same cancer. With the rapid accumulation of microarray data, it is of great interest to integrate inter-study microarray data to increase sample size, which could lead to the discovery of more reliable markers.
Results: We present a novel, simple method of integrating different microarray datasets to identify marker genes and apply the method to prostate cancer datasets. In this study, by applying a new statistical method, referred to as the top-scoring pair (TSP) classifier, we have identified a pair of robust marker genes (HPN and STAT6) by integrating microarray datasets from three different prostate cancer studies. Cross-platform validation shows that the TSP classifier built from the marker gene pair, which simply compares relative expression values, achieves high accuracy, sensitivity and specificity on independent datasets generated using various array platforms. Our findings suggest a new model for the discovery of marker genes from accumulated microarray data and demonstrate how the great wealth of microarray data can be exploited to increase the power of statistical analysis.
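The abstract describes the TSP classifier only as a rule that compares the relative expression values of two genes. As a rough illustration of that idea, and not the authors' implementation, the following Python sketch scores every gene pair by how differently the ordering "gene i below gene j" occurs in the two classes, keeps the highest-scoring pair, and classifies a new sample by that ordering alone. The function names `tsp_fit` and `tsp_predict`, the toy data, and the 0/1 label coding are assumptions made for this example; the tie-breaking step used when several pairs share the top score is left out.

```python
from itertools import combinations


def tsp_fit(samples, labels):
    """Pick the gene pair whose ordering differs most between classes.

    samples: list of expression vectors, one per sample (hypothetical toy input)
    labels:  list of 0/1 class labels (assumed coding: 0 = normal, 1 = cancer)
    Returns (i, j, less_means_one, score).
    """
    class0 = [s for s, y in zip(samples, labels) if y == 0]
    class1 = [s for s, y in zip(samples, labels) if y == 1]
    n_genes = len(samples[0])
    best = None
    for i, j in combinations(range(n_genes), 2):
        # Fraction of samples in each class where gene i is below gene j.
        p0 = sum(s[i] < s[j] for s in class0) / len(class0)
        p1 = sum(s[i] < s[j] for s in class1) / len(class1)
        score = abs(p1 - p0)
        # Keep the first pair with the highest score (no tie-breaking here).
        if best is None or score > best[3]:
            best = (i, j, p1 > p0, score)
    return best


def tsp_predict(model, sample):
    """Classify one sample using only the ordering of the chosen pair."""
    i, j, less_means_one, _ = model
    return int((sample[i] < sample[j]) == less_means_one)


# Toy data: gene 0 is above gene 1 in class 0 and below it in class 1.
samples = [[5, 1, 3], [6, 2, 9], [4, 3, 1],
           [1, 5, 3], [2, 6, 9], [3, 4, 1]]
labels = [0, 0, 0, 1, 1, 1]

model = tsp_fit(samples, labels)
print(model)
print(tsp_predict(model, [10, 20, 0]), tsp_predict(model, [200, 100, 50]))
```

Because the decision depends only on which of the two values is larger within a sample, it is unchanged by any monotone rescaling of that sample, which is one plausible reason such a rule can carry across array platforms.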
Pages: 3905-3911
Number of pages: 7