Exact and approximate algorithms for variable selection in linear discriminant analysis

Cited: 13
Authors
Brusco, Michael J. [1 ]
Steinley, Douglas [2 ]
Affiliations
[1] Florida State Univ, Coll Business, Dept Mkt, Tallahassee, FL 32306 USA
[2] Univ Missouri Columbia, Columbia, MO USA
Keywords
Linear discriminant analysis; Variable selection; Branch and bound; Tabu search; Well-formulated subsets; Polynomial regression; Multiple measurements; Multivariate analysis; Models; Stepwise
DOI
10.1016/j.csda.2010.05.027
CLC number
TP39 [Applications of computers]
Discipline classification codes
081203; 0835
Abstract
Variable selection is a venerable problem in multivariate statistics. In the context of discriminant analysis, the goal is to select a subset of variables that accomplishes one of two objectives: (1) the provision of a parsimonious, yet descriptive, representation of group structure, or (2) the ability to correctly allocate new cases to groups. We present an exact (branch-and-bound) algorithm for variable selection in linear discriminant analysis that identifies subsets of variables that minimize Wilks' Λ. An important feature of this algorithm is a variable reordering scheme that greatly reduces computation time. We also present an approximate procedure based on tabu search, which can be implemented for a variety of objective criteria designed for either the descriptive or allocation goals associated with discriminant analysis. The tabu search heuristic is especially useful for maximizing the hit ratio (i.e., the percentage of correctly classified cases). Computational results for the proposed methods are provided for two data sets from the literature. © 2010 Elsevier B.V. All rights reserved.
Pages: 123-131 (9 pages)
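
As an illustration of the abstract above, the sketch below computes Wilks' lambda, Λ = det(W)/det(T), for a candidate subset of variables, where W is the pooled within-groups SSCP matrix and T the total SSCP matrix, and then scores all subsets of a fixed size by exhaustive enumeration. This is a minimal sketch, not the authors' branch-and-bound or tabu search procedures; the names X, y, and subset_size are assumptions for illustration, with X taken to be a NumPy array of cases by variables and y an array of group labels.

    # Minimal illustrative sketch (not the paper's algorithms): score variable
    # subsets for linear discriminant analysis by Wilks' lambda.
    from itertools import combinations
    import numpy as np

    def wilks_lambda(X, y, subset):
        # Wilks' lambda = det(W) / det(T) on the selected columns, where W is the
        # pooled within-groups SSCP matrix and T is the total SSCP matrix.
        Xs = X[:, list(subset)]
        dev_total = Xs - Xs.mean(axis=0)
        T = dev_total.T @ dev_total
        W = np.zeros_like(T)
        for g in np.unique(y):
            Xg = Xs[y == g]
            dev_group = Xg - Xg.mean(axis=0)
            W += dev_group.T @ dev_group
        return np.linalg.det(W) / np.linalg.det(T)

    def best_subset(X, y, subset_size):
        # Exhaustive search over all subsets of the given size; a smaller lambda
        # indicates stronger group separation on the selected variables.
        return min(combinations(range(X.shape[1]), subset_size),
                   key=lambda s: wilks_lambda(X, y, s))

Because the number of candidate subsets grows combinatorially in the number of variables, exhaustive scoring of this kind is feasible only for small problems; the branch-and-bound algorithm described in the abstract prunes this search space exactly, while the tabu search heuristic trades guaranteed optimality for scalability and for flexibility in the objective criterion (e.g., the hit ratio).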