Letter to the Editor: On the term 'interaction' and related phrases in the literature on Random Forests

Cited by: 49
Authors
Boulesteix, Anne-Laure [1]
Janitza, Silke [2]
Hapfelmeier, Alexander [3]
Van Steen, Kristel [4]
Strobl, Carolin [5]
Affiliations
[1] Univ Munich, Computat Mol Med, D-81377 Munich, Germany
[2] Univ Munich, D-81377 Munich, Germany
[3] Tech Univ Munich, Inst Med Stat & Epidemiol, D-80290 Munich, Germany
[4] Univ Liege, Inst Montefiore, B-4000 Liege, Belgium
[5] Univ Zurich, CH-8006 Zurich, Switzerland
Keywords
random forest; statistics; interaction; correlation; conditional inference trees; conditional variable importance; variable importance measures; classification; predictors; regression
DOI
10.1093/bib/bbu012
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
In an interesting and quite exhaustive review on Random Forests (RF) methodology in bioinformatics, Touw et al. address, among other topics, the problem of detecting interactions between variables based on RF methodology. We feel that some important statistical concepts, such as 'interaction', 'conditional dependence' or 'correlation', are sometimes employed inconsistently in the bioinformatics literature in general and in the literature on RF in particular. In this letter to the Editor, we aim to clarify some of the central statistical concepts and point out some confusing interpretations concerning RF given by Touw et al. and other authors.
Pages: 338-345
Number of pages: 8