Where are the large and difficult datasets?
Cited by: 0
Authors:
Adrien Jamain
David J. Hand
Affiliations:
[1] BNP-Paribas, Department of Mathematics
[2] Institute for Mathematical Sciences
Source:
Advances in Data Analysis and Classification | 2009 | Vol. 3
Keywords:
Error rate;
Meta-analysis;
Comparative studies;
Repositories;
62-07;
68T10;
DOI: not available
Abstract:
A great many comparative performance assessments of classification rules have been undertaken, ranging from small ones involving just one or two methods, to large ones involving many tens of methods. We are undertaking a meta-analytic study of these studies, attempting to distil some overall conclusions. This paper describes just one of our observations. The dataset analysed in this paper contains 5,203 error rates taken from 45 articles and describing 146 datasets. One curious general relationship which was persistent in our data, despite the fact that we were looking at results mixed between distributions rather than conditional on distributions, was that error rate decreased with increasing dataset size. We believe this to be an artefact of the way datasets are collected by the research community.
Pages: 25-38
Page count: 13