A MULTIVARIATE ADAPTIVE STOCHASTIC SEARCH METHOD FOR DIMENSIONALITY REDUCTION IN CLASSIFICATION

Cited by: 3
Authors
Tian, Tian Siva [1 ]
James, Gareth M. [2 ]
Wilcox, Rand R. [3 ]
Affiliations
[1] Univ Houston, Dept Psychol, Houston, TX 77204 USA
[2] Univ So Calif, Dept Informat & Operat Management, Los Angeles, CA 90089 USA
[3] Univ So Calif, Dept Psychol, Los Angeles, CA 90089 USA
Keywords
Dimensionality reduction; classification; variable selection; variable combination; Lasso; shrinkage; cancer
DOI
10.1214/09-AOAS284
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject Classification Codes
020208 ; 070103 ; 0714 ;
Abstract
High-dimensional classification has become an increasingly important problem. In this paper we propose a "Multivariate Adaptive Stochastic Search" (MASS) approach which first reduces the dimension of the data space and then applies a standard classification method to the reduced space. One key advantage of MASS is that it automatically adjusts to mimic variable selection methods, such as the Lasso, variable combination methods, such as PCA, or methods that combine these two approaches. The adaptivity of MASS allows it to perform well in situations where pure variable selection or variable combination methods fail. Another major advantage of our approach is that MASS can accurately project the data into very low-dimensional non-linear, as well as linear, spaces. MASS uses a stochastic search algorithm to select a handful of optimal projection directions from a large number of random directions in each iteration. We provide some theoretical justification for MASS and demonstrate its strengths on an extensive range of simulation studies and real-world data sets by comparing it to many classical and modern classification methods.
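The core loop described in the abstract — drawing many random projection directions each iteration and keeping the few that best separate the classes — can be sketched as follows. This is an illustrative toy, not the authors' MASS implementation: the function name `mass_sketch`, the crude standardized-mean-difference score, and all parameter choices are assumptions made for the example.

```python
import numpy as np

def mass_sketch(X, y, n_dirs=2, n_candidates=200, n_iter=20, rng=None):
    """Toy stochastic search for projection directions (two classes, 0/1).

    Each iteration draws n_candidates random unit directions, scores every
    direction by how well the 1-D projection separates the classes, and
    retains the n_dirs best directions seen so far.
    """
    rng = np.random.default_rng(rng)
    n, p = X.shape

    def score(direction):
        z = X @ direction
        m0, m1 = z[y == 0].mean(), z[y == 1].mean()
        # crude class-separation criterion: standardized mean difference
        return abs(m1 - m0) / (z.std() + 1e-12)

    # start from a few random unit directions
    best = rng.standard_normal((n_dirs, p))
    best /= np.linalg.norm(best, axis=1, keepdims=True)

    for _ in range(n_iter):
        cand = rng.standard_normal((n_candidates, p))
        cand /= np.linalg.norm(cand, axis=1, keepdims=True)
        pool = np.vstack([best, cand])          # current best compete with new draws
        scores = np.array([score(d) for d in pool])
        best = pool[np.argsort(scores)[-n_dirs:]]  # keep the top n_dirs directions

    return X @ best.T  # data projected into the reduced space

# toy example: two Gaussian classes separated along the first coordinate
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = (rng.random(200) < 0.5).astype(int)
X[y == 1, 0] += 3.0
Z = mass_sketch(X, y, n_dirs=2, rng=1)
print(Z.shape)  # (200, 2)
```

A standard classifier would then be fit on the reduced data `Z`; the actual MASS procedure additionally adapts the sparsity of the candidate directions, which lets it interpolate between Lasso-like selection and PCA-like combination.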
Pages: 340-365
Page count: 26