Selecting genes by test statistics

Times cited: 37
Authors
Chen, DC
Liu, ZQ
Ma, XB
Hua, D
Affiliations
[1] Uniformed Serv Univ Hlth Sci, Div Epidemiol & Biostat, Bethesda, MD 20814 USA
[2] TATRC, Bioinformat Cell, Frederick, MD 21703 USA
[3] Univ Minnesota, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
[4] George Washington Univ, Dept Comp Sci, Washington, DC 20052 USA
Source
JOURNAL OF BIOMEDICINE AND BIOTECHNOLOGY | 2005 / Issue 02
Funding
National Science Foundation (USA);
DOI
10.1155/JBB.2005.132
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Gene selection is an important issue in analyzing multiclass microarray data. Among the many proposed selection methods, the traditional ANOVA F test statistic has been employed to identify informative genes for both class prediction (classification) and class discovery problems. However, the F test assumes that all classes share a common variance, an assumption that may not be realistic for gene expression data. This paper explores alternative test statistics that can handle heterogeneity of the variances. We study five such test statistics, including the Brown-Forsythe and Welch test statistics. Their performance is evaluated and compared with that of the F statistic across different classification methods applied to publicly available microarray datasets.
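To make the contrast in the abstract concrete, the following is a minimal sketch of three of the statistics it names, computed for a single gene and then used to rank genes. It uses the standard textbook forms of the ANOVA F, Brown-Forsythe, and Welch statistics, which may differ in detail from the definitions in the paper. The function names (`anova_f`, `brown_forsythe`, `welch`, `rank_genes`) and the data layout are assumptions made for illustration only.

```python
from statistics import mean, variance


def _summaries(groups):
    """Per-class sizes, means, and sample variances (n - 1 denominator)."""
    ns = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    variances = [variance(g) for g in groups]
    return ns, means, variances


def anova_f(groups):
    """Classical one-way ANOVA F statistic; pools variance across classes."""
    ns, means, variances = _summaries(groups)
    k, total = len(groups), sum(ns)
    grand = sum(n * m for n, m in zip(ns, means)) / total
    between = sum(n * (m - grand) ** 2 for n, m in zip(ns, means)) / (k - 1)
    within = sum((n - 1) * v for n, v in zip(ns, variances)) / (total - k)
    return between / within


def brown_forsythe(groups):
    """Brown-Forsythe statistic; the denominator keeps class variances separate."""
    ns, means, variances = _summaries(groups)
    total = sum(ns)
    grand = sum(n * m for n, m in zip(ns, means)) / total
    numerator = sum(n * (m - grand) ** 2 for n, m in zip(ns, means))
    denominator = sum((1 - n / total) * v for n, v in zip(ns, variances))
    return numerator / denominator


def welch(groups):
    """Welch statistic; each class is weighted by n_i / s_i^2.

    Assumes every class has nonzero sample variance.
    """
    ns, means, variances = _summaries(groups)
    k = len(groups)
    weights = [n / v for n, v in zip(ns, variances)]
    weight_sum = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / weight_sum
    numerator = sum(
        w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)
    ) / (k - 1)
    spread = sum((1 - w / weight_sum) ** 2 / (n - 1) for w, n in zip(weights, ns))
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * spread
    return numerator / denominator


def rank_genes(expression, labels, statistic):
    """Order gene names by decreasing value of the chosen statistic.

    expression: dict mapping gene name -> list of values, one per sample
    labels: list of class labels, aligned with the sample order
    """
    classes = sorted(set(labels))
    scores = {}
    for gene, values in expression.items():
        groups = [[v for v, c in zip(values, labels) if c == cls] for cls in classes]
        scores[gene] = statistic(groups)
    return sorted(scores, key=scores.get, reverse=True)
```

A gene selection step would then keep the top-ranked genes, for example `rank_genes(expression, labels, welch)[:50]`, before training a classifier. With equal class sizes the Brown-Forsythe value coincides with the ANOVA F value, so the two rankings differ only on unbalanced designs.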
Pages: 132-138
Number of pages: 7