Inferential, robust non-negative matrix factorization analysis of microarray data

Cited by: 46
Authors
Fogel, Paul
Young, S. Stanley
Hawkins, Douglas M.
Ledirac, Nathalie
Affiliations
[1] Natl Inst Stat Sci, Res Triangle Pk, NC 27709 USA
[2] Univ Minnesota, Sch Stat, Minneapolis, MN 55455 USA
[3] INRA, Ctr Rech, Lab Toxicol Cellulaire & Mol, F-06903 Sophia Antipolis, France
DOI
10.1093/bioinformatics/btl550
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Motivation: Modern methods such as microarrays, proteomics and metabolomics often produce datasets where there are many more predictor variables than observations. Research in these areas is often exploratory; even so, there is interest in statistical methods that accurately point to effects that are likely to replicate. Correlations among predictors are used to improve the statistical analysis. We exploit two ideas: non-negative matrix factorization methods that create ordered sets of predictors; and statistical testing within ordered sets, which is done sequentially, removing the need for correction for multiple testing within the set. Results: Simulations and theory point to increased statistical power. Computational algorithms are described in detail. The analysis and biological interpretation of a real dataset are given. In addition to the increased power, the benefit of our method is that the organized gene lists are likely to lead to a better understanding of the biology.
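The sequential testing idea mentioned in the abstract can be illustrated with a generic fixed-sequence procedure. This is a minimal sketch, not the authors' algorithm: it assumes the predictors have already been placed in an order that was fixed before the tests were evaluated (the abstract attributes the ordering to non-negative matrix factorization, which is not implemented here). The function name `fixed_sequence_test`, the threshold `alpha`, and the example p-values are all hypothetical.

```python
def fixed_sequence_test(ordered_p_values, alpha=0.05):
    """Test hypotheses in a pre-specified order, each at level alpha.

    Testing stops at the first p-value above alpha. Because every test
    is only reached if all earlier ones were rejected, no per-test
    multiplicity correction is applied within the ordered set.
    Returns the number of leading hypotheses rejected.
    """
    rejected = 0
    for p in ordered_p_values:
        if p > alpha:
            break  # first non-rejection ends the sequence
        rejected += 1
    return rejected


# Hypothetical p-values for one ordered set of predictors.
ordered_p = [0.001, 0.004, 0.03, 0.20, 0.01]
n_rejected = fixed_sequence_test(ordered_p)
print(n_rejected)
```

In this sketch the fifth predictor is never tested, even though its p-value is small, because the sequence stops at the fourth. The validity of such a procedure depends on the ordering not being derived from the test outcomes themselves.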
Pages: 44-49
Page count: 6
Related papers
26 in total
[1]  
[Anonymous], 1993, Resampling-based multiple testing: Examples and methods for P-value adjustment
[2]  
[Anonymous], ADV NEURAL INFORM PR
[3]   On the adaptive control of the false discovery rate in multiple testing with independent statistics [J].
Benjamini, Y ;
Hochberg, Y .
JOURNAL OF EDUCATIONAL AND BEHAVIORAL STATISTICS, 2000, 25 (01) :60-83
[4]   CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING [J].
BENJAMINI, Y ;
HOCHBERG, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[5]   Genome-wide approach to identify risk factors for therapy-related myeloid leukemia [J].
Bogni, A ;
Cheng, C ;
Liu, W ;
Yang, W ;
Pfeffer, J ;
Mukatira, S ;
French, D ;
Downing, JR ;
Pui, CH ;
Relling, MV .
LEUKEMIA, 2006, 20 (02) :239-246
[6]   Metagenes and molecular pattern discovery using matrix factorization [J].
Brunet, JP ;
Tamayo, P ;
Golub, TR ;
Mesirov, JP .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2004, 101 (12) :4164-4169
[7]  
Durbin B P, 2002, Bioinformatics, V18 Suppl 1, pS105
[8]  
Fisher R. A., 1925, STAT METHODS RES WOR
[9]   LOWER RANK APPROXIMATION OF MATRICES BY LEAST-SQUARES WITH ANY CHOICE OF WEIGHTS [J].
GABRIEL, KR ;
ZAMIR, S .
TECHNOMETRICS, 1979, 21 (04) :489-498
[10]   Improving molecular cancer class discovery through sparse non-negative matrix factorization [J].
Gao, Y ;
Church, G .
BIOINFORMATICS, 2005, 21 (21) :3970-3975