Grid search in hyperparameter optimization of machine learning models for prediction of HIV/AIDS test results

Cited by: 228
Authors
Belete D.M. [1 ]
Huchaiah M.D. [1 ]
Affiliations
[1] Department of Computer Science, Mangalore University, Mangalore
Keywords
EDHS; Grid search; HIV/AIDS test result; Hyperparameter optimization; Machine learning; Prediction
DOI
10.1080/1206212X.2021.1974663
Abstract
In this work, we propose hyperparameter optimization using grid search to optimize the parameters of eight existing models and apply the best parameters to predict the outcomes of HIV tests from the Ethiopian Demographic and Health Survey (EDHS) HIV/AIDS dataset. The core challenge of this work is to find the optimal parameter values that yield the best model, given the uncertain computational cost of training and testing predictive models over many candidate hyperparameter values. To overcome these challenges, we explore the effects of hyperparameter optimization by applying the proposed grid search hyperparameter optimization (GSHPO) to the considered models to strengthen their predictive power. An extensive number of experiments is conducted to confirm the feasibility of our proposed method. These experiments are done in two separate phases: in the first phase, we test the selected models before hyperparameter optimization is applied (using the default parameters); in the second phase, we test them after hyperparameter optimization is applied (using GSHPO). Throughout the experiments, 10-fold cross-validation is used to reduce the bias of the models. The proposed system tunes the hyperparameters of the prediction algorithms using the grid search approach. Several standard metrics are used to assess the method's efficiency, such as accuracy, precision, recall, F1-score, AUC-ROC, MAE, RMSE, R2, and the confusion matrix, to compare the results of each experiment. The results obtained after applying 10-fold cross-validation and the proposed GSHPO are promising. Our findings suggest that tuning the models' hyperparameters has a statistically significant positive impact on the models' prediction accuracy. © 2021 Informa UK Limited, trading as Taylor & Francis Group.
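The grid search procedure the abstract describes — exhaustively evaluating every combination of candidate hyperparameter values and keeping the one with the best (e.g. cross-validated) score — can be sketched in plain Python. This is a minimal illustration, not the paper's implementation; the parameter names (`max_depth`, `n_estimators`) and the toy scoring function are hypothetical stand-ins for a real model's mean 10-fold cross-validation accuracy:

```python
from itertools import product

def grid_search(score_fn, param_grid):
    """Evaluate every combination in param_grid with score_fn and
    return the best-scoring parameter dict (higher score is better)."""
    names = sorted(param_grid)
    best_score, best_params = float("-inf"), None
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)  # in practice: mean 10-fold CV score
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

# Hypothetical scoring function standing in for a cross-validated
# model; it peaks at max_depth=5, n_estimators=100.
def toy_score(max_depth, n_estimators):
    return -abs(max_depth - 5) - abs(n_estimators - 100) / 100

grid = {"max_depth": [3, 5, 7], "n_estimators": [50, 100, 200]}
best_params, best_score = grid_search(toy_score, grid)
print(best_params)  # {'max_depth': 5, 'n_estimators': 100}
```

The cost of this exhaustive sweep grows multiplicatively with the grid — here 3 × 3 = 9 evaluations, each of which would cost 10 model fits under 10-fold cross-validation — which is the training-cost challenge the abstract refers to.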
Pages: 875–886 (11 pages)