ON THE GRADIENT-BASED ALGORITHM FOR MATRIX FACTORIZATION APPLIED TO DIMENSIONALITY REDUCTION

Cited by: 0
Authors
Nikulin, Vladimir [1]
McLachlan, Geoffrey J. [1]
Affiliations
[1] Univ Queensland, Dept Math, Brisbane, Qld, Australia
Source
BIOINFORMATICS 2010: PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON BIOINFORMATICS | 2010
Keywords
Matrix factorisation; Gradient-based optimisation; Cross-validation; Gene expression data; GENE-EXPRESSION DATA; MICROARRAY DATA; CLASSIFICATION; SELECTION;
DOI
Not available
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07; 0710; 09;
Abstract
The high dimensionality of microarray data, in which the expression of thousands of genes is measured in a much smaller number of samples, presents challenges that affect the applicability of the analytical results. In principle, it would be better to describe the data in terms of a small number of metagenes, derived as a result of matrix factorisation, which could reduce noise while still capturing the essential features of the data. We propose a fast and general method for matrix factorisation, based on decomposition by parts, that can reduce the dimension of expression data from thousands of genes to several factors. Unlike classification and regression, matrix decomposition requires no response variable and thus falls into the category of unsupervised learning methods. We demonstrate the effectiveness of this approach for the supervised classification of gene expression data.
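The idea of reducing an expression matrix to a few metagenes by gradient-based factorisation can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given in the abstract: it assumes a plain squared-error loss, full-batch gradient descent, and a synthetic rank-2 matrix in place of real microarray data. The function name `factorize` and the parameters `k`, `lr`, `epochs`, and `seed` are hypothetical.

```python
import numpy as np


def factorize(X, k, lr=0.002, epochs=2000, seed=0):
    """Approximate X (genes x samples) by A @ B with k factors.

    Assumption: squared-error loss minimised by plain gradient descent.
    """
    rng = np.random.default_rng(seed)
    p, n = X.shape
    A = 0.1 * rng.standard_normal((p, k))  # gene loadings
    B = 0.1 * rng.standard_normal((k, n))  # metagene values per sample
    losses = []
    for _ in range(epochs):
        E = X - A @ B                      # residual matrix
        losses.append(float(np.sum(E ** 2)))
        grad_A = -2.0 * E @ B.T            # d(loss)/dA
        grad_B = -2.0 * A.T @ E            # d(loss)/dB
        A -= lr * grad_A
        B -= lr * grad_B
    return A, B, losses


# Synthetic stand-in for expression data: 30 "genes", 8 "samples", rank 2.
rng = np.random.default_rng(1)
X = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 8))
A, B, losses = factorize(X, k=2)
```

In such a sketch the rows of `B` would play the role of metagenes, and its columns could be passed to a classifier in place of the original gene-level features.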
Pages: 147-152
Number of pages: 6