Efficient Hyperparameter Tuning with Grid Search for Text Categorization using kNN Approach with BM25 Similarity

Cited by: 66
Authors
Ghawi, Raji [1]
Pfeffer, Juergen [1]
Affiliations
[1] Tech Univ Munich, Munich, Germany
Source
OPEN COMPUTER SCIENCE | 2019 / Vol. 9 / Issue 01
Keywords
hyperparameter tuning; text categorization; grid search; kNN; BM25; OPTIMIZATION
DOI
10.1515/comp-2019-0011
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In machine learning, hyperparameter tuning is the problem of choosing a set of optimal hyperparameters for a learning algorithm. Several approaches have been widely adopted for hyperparameter tuning, which is typically a time-consuming process. We propose an efficient technique to speed up the process of hyperparameter tuning with Grid Search. We applied this technique to text categorization using the kNN algorithm with BM25 similarity, where three hyperparameters need to be tuned. Our experiments show that our proposed technique is at least an order of magnitude faster than conventional tuning.
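For orientation, the conventional baseline the abstract refers to can be sketched as an exhaustive grid search over a kNN classifier that ranks neighbours by BM25 score. This is a minimal illustration, not the authors' technique: the abstract does not name the three hyperparameters, so the sketch assumes they are the neighbour count `k` and the usual BM25 parameters `k1` and `b`. The toy corpus, the function names, and the score-weighted vote are likewise assumptions made for the example.

```python
import math
from collections import Counter
from itertools import product


def bm25_scores(query, docs, k1, b):
    """Okapi BM25 score of each tokenized document in docs for the query tokens."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequencies
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in set(query):
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / norm
        scores.append(s)
    return scores


def knn_predict(query, train_docs, train_labels, k, k1, b):
    """Label a query by a BM25-weighted vote among its k best-scoring training docs."""
    scores = bm25_scores(query, train_docs, k1, b)
    top = sorted(range(len(train_docs)), key=lambda i: scores[i], reverse=True)[:k]
    votes = Counter()
    for i in top:
        votes[train_labels[i]] += scores[i]
    return votes.most_common(1)[0][0]


def grid_search(train_docs, train_labels, val_docs, val_labels, grid):
    """Conventional grid search: re-run the classifier for every (k, k1, b) triple."""
    best_params, best_acc = None, -1.0
    for k, k1, b in product(grid["k"], grid["k1"], grid["b"]):
        hits = sum(
            knn_predict(q, train_docs, train_labels, k, k1, b) == y
            for q, y in zip(val_docs, val_labels)
        )
        acc = hits / len(val_docs)
        if acc > best_acc:  # strict '>' keeps the first of any tied settings
            best_params, best_acc = {"k": k, "k1": k1, "b": b}, acc
    return best_params, best_acc


# Toy data (hypothetical), already tokenized.
TRAIN_DOCS = [
    ["ball", "goal", "team"],
    ["match", "team", "score"],
    ["stock", "market", "price"],
    ["bank", "market", "loan"],
]
TRAIN_LABELS = ["sport", "sport", "finance", "finance"]
VAL_DOCS = [["team", "goal"], ["market", "loan"]]
VAL_LABELS = ["sport", "finance"]
GRID = {"k": [1, 3], "k1": [1.2, 2.0], "b": [0.5, 0.75]}

best_params, best_acc = grid_search(TRAIN_DOCS, TRAIN_LABELS, VAL_DOCS, VAL_LABELS, GRID)
print(best_params, best_acc)
```

Every grid point here re-scores every validation document against the whole training set, so the cost grows with the product of the three grid sizes. That repeated work is presumably what makes conventional tuning slow; how the proposed technique avoids it is not described in the abstract and is not reproduced here.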
Pages: 160-180
Page count: 21