Towards evidence-based computational statistics: lessons from clinical research on the role and design of real-data benchmark studies

Cited by: 52
Authors
Boulesteix, Anne-Laure [1]
Wilson, Rory [1]
Hapfelmeier, Alexander [2]
Affiliations
[1] Ludwig Maximilians Univ Munchen, Inst Med Informat Proc Biometry & Epidemiol, Marchioninistr 15, D-81377 Munich, Germany
[2] Tech Univ Munich, Inst Med Stat & Epidemiol, Ismaninger Str 22, D-81675 Munich, Germany
Keywords
Method evaluation; Good practice; Comparison study; Clinical trial; Classification methods; Bioinformatics; Classifiers; Bias
DOI
10.1186/s12874-017-0417-2
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services management)]
Abstract
Background: The goal of medical research is to develop interventions that are in some sense superior, with respect to patient outcome, to interventions currently in use. Similarly, the goal of research in methodological computational statistics is to develop data analysis tools that are themselves superior to the existing tools. The methodology of the evaluation of medical interventions continues to be discussed extensively in the literature, and it is now well accepted that medicine should be at least partly "evidence-based". Although we statisticians are convinced of the importance of unbiased, well-thought-out study designs and evidence-based approaches in the context of clinical research, we tend to ignore these principles when designing our own studies for evaluating statistical methods in the context of our methodological research.
Main message: In this paper, we draw an analogy between clinical trials and real-data-based benchmarking experiments in methodological statistical science, with datasets playing the role of patients and methods playing the role of medical interventions. Through this analogy, we suggest directions for improvement in the design and interpretation of studies which use real data to evaluate statistical methods, in particular with respect to dataset inclusion criteria and the reduction of various forms of bias. More generally, we discuss the concept of "evidence-based" statistical research, its limitations and its impact on the design and interpretation of real-data-based benchmark experiments.
Conclusion: We suggest that benchmark studies, a method of assessment of statistical methods using real-world datasets, might benefit from adopting (some) concepts from evidence-based medicine towards the goal of more evidence-based statistical research.
Pages: 12