Lookahead and piloting strategies for variable selection

Times cited: 0
Authors
Zhang, Junni L. [1]
Lin, Ming T.
Liu, Jun S.
Chen, Rong
Affiliations
[1] Peking Univ, Guanghua Sch Management, Dept Business Stat & Econometr, Beijing 100871, Peoples R China
[2] Univ Illinois, Coll Business Adm, Dept Informat & Decis Sci MC 294, Chicago, IL 60607 USA
[3] Harvard Univ, Dept Stat, Cambridge, MA 02138 USA
Keywords
AIC; Akaike information criterion; BIC; Bayesian information criterion; gene regulation; Gibbs sampler; microarray data; sequential Monte Carlo; TFBM; transcription factor binding-site motif;
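DOI
Not available
Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code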
020208 ; 070103 ; 0714 ;
Abstract
The traditional variable selection problem has attracted renewed attention from statistical researchers owing to recent advances in data collection, especially in fields such as bioinformatics and marketing. In this paper, we formulate regression variable selection as an optimization problem, and we propose and study several deterministic and stochastic sequential optimization methods with lookahead. Using several synthetic examples, we show that the stochastic sequential method with lookahead robustly and significantly outperforms a few close competitors, including the popular stepwise methods. When applied to a yeast amino acid starvation microarray experiment, this method finds many transcription factors that are known to be important for yeast to cope with stress and starvation.
Pages: 985-1003
Number of pages: 19
Related papers
32 in total
[1]   LIKELIHOOD OF A MODEL AND INFORMATION CRITERIA [J].
AKAIKE, H .
JOURNAL OF ECONOMETRICS, 1981, 16 (01) :3-14
[2]   The intrinsic Bayes factor for model selection and prediction [J].
Berger, JO ;
Pericchi, LR .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1996, 91 (433) :109-122
[3]  
BIAO X, 2004, U CALIFORNIA BERKELE
[4]   Met31p and Met32p, two related zinc finger proteins, are involved in transcriptional regulation of yeast sulfur amino acid metabolism [J].
Blaiseau, PL ;
Isnard, AD ;
Surdin-Kerjan, Y ;
Thomas, D .
MOLECULAR AND CELLULAR BIOLOGY, 1997, 17 (07) :3640-3648
[5]  
CARLIN BP, 1995, J ROY STAT SOC B MET, V57, P473
[6]   Adaptive joint detection and decoding in flat-fading channels via mixture Kalman filtering [J].
Chen, R ;
Wang, XD ;
Liu, JS .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2000, 46 (06) :2079-2094
[7]   Integrating regulatory motif discovery and genome-wide expression analysis [J].
Conlon, EM ;
Liu, XS ;
Lieb, JD ;
Liu, JS .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2003, 100 (06) :3339-3344
[8]  
Efroymson M.A., 1960, MATH METHODS DIGITAL
[9]   Least angle regression - Rejoinder [J].
Efron, B ;
Hastie, T ;
Johnstone, I ;
Tibshirani, R .
ANNALS OF STATISTICS, 2004, 32 (02) :494-499
[10]   Benchmark priors for Bayesian model averaging [J].
Fernández, C ;
Ley, E ;
Steel, MFJ .
JOURNAL OF ECONOMETRICS, 2001, 100 (02) :381-427