Sample size determination: posterior distributions proximity

Cited by: 0
Authors
Kiselev, Nikita [1 ]
Grabovoy, Andrey [1 ]
Affiliations
[1] Moscow Inst Phys & Technol, Dolgoprudnyi, Russia
Keywords
Sufficient sample size; Posterior distributions proximity; Normal posterior distribution; Linear regression; POWER; SELECTION; TESTS;
DOI
10.1007/s10287-024-00528-9
Chinese Library Classification number
O1 [Mathematics]; C [Social Sciences, General];
Subject classification codes
03 ; 0303 ; 0701 ; 070101 ;
Abstract
Sample size determination is crucial for constructing an effective machine learning model. However, existing methods for determining a sufficient sample size are either not rigorously proven or tied to a specific statistical hypothesis about the distribution of model parameters. In this paper we present two approaches based on the proximity of posterior distributions of model parameters on similar subsamples. We show that both methods are valid for models with a normal posterior distribution of parameters. Computational experiments demonstrate the convergence of the proposed functions as the sample size increases. We also compare the proposed methods with other approaches on different datasets.
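As a rough illustration of the idea of comparing posteriors on similar subsamples, the sketch below fits a Bayesian linear regression on the first k and the first k + 1 observations and measures how far apart the two normal posteriors are. The abstract does not state which proximity measure the paper uses, so the Kullback-Leibler divergence here is an assumption, as are the function names (`posterior`, `kl_normal`, `kl_curve`), the isotropic prior with precision `alpha`, the known noise variance `sigma2`, and the synthetic data. It is not the authors' implementation.

```python
import numpy as np


def posterior(X, y, alpha=1.0, sigma2=1.0):
    """Normal posterior of linear-regression weights.

    Assumes a prior N(0, I / alpha) and a known noise variance sigma2
    (both are illustrative choices, not taken from the paper).
    """
    d = X.shape[1]
    precision = alpha * np.eye(d) + X.T @ X / sigma2
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / sigma2
    return mean, cov


def kl_normal(m0, S0, m1, S1):
    """KL divergence KL(N(m0, S0) || N(m1, S1)) between two Gaussians."""
    d = m0.shape[0]
    diff = m1 - m0
    trace_term = np.trace(np.linalg.solve(S1, S0))
    quad_term = diff @ np.linalg.solve(S1, diff)
    logdet_term = np.linalg.slogdet(S1)[1] - np.linalg.slogdet(S0)[1]
    return 0.5 * (trace_term + quad_term - d + logdet_term)


def kl_curve(X, y, k_min=5, alpha=1.0, sigma2=1.0):
    """Divergence between posteriors on the first k and first k + 1 points."""
    ks = list(range(k_min, len(y)))
    values = []
    for k in ks:
        m0, S0 = posterior(X[:k], y[:k], alpha, sigma2)
        m1, S1 = posterior(X[:k + 1], y[:k + 1], alpha, sigma2)
        values.append(kl_normal(m0, S0, m1, S1))
    return np.array(ks), np.array(values)


# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.normal(scale=1.0, size=200)

ks, kl = kl_curve(X, y)
print("first values:", kl[:3])
print("last values: ", kl[-3:])
```

Under these assumptions, adding one observation changes the posterior less and less as k grows, so a sample size could be called sufficient once the divergence stays below a chosen threshold. The threshold and the exact criterion are design choices that this record does not specify.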
Pages: 16