Five myths about variable selection

Cited by: 394
Authors
Heinze, Georg [1]
Dunkler, Daniela [1]
Affiliations
[1] Med Univ Vienna, Ctr Med Stat Informat & Intelligent Syst, Sect Clin Biometr, Spitalgasse 23, A-1090 Vienna, Austria
Keywords
association; explanatory models; multivariable modeling; prediction; statistical analysis; LIVER-TRANSPLANTATION; SURVIVAL; RECIPIENTS; EVENTS; MODEL
DOI
10.1111/tri.12895
Chinese Library Classification number
R61 [Operative Surgery]
Subject classification code
Abstract
Multivariable regression models are often used in transplantation research to identify or to confirm baseline variables which have an independent association, causally or only evidenced by statistical correlation, with transplantation outcome. Although sound theory is lacking, variable selection is a popular statistical method which seemingly reduces the complexity of such models. However, in fact, variable selection often complicates analysis as it invalidates common tools of statistical inference such as P-values and confidence intervals. This is a particular problem in transplantation research where sample sizes are often only small to moderate. Furthermore, variable selection requires computer-intensive stability investigations and a particularly cautious interpretation of results. We discuss how five common misconceptions often lead to inappropriate application of variable selection. We emphasize that variable selection and all problems related to it can often be avoided by the use of expert knowledge.
Pages: 6-10
Number of pages: 5