Efficient Hyperparameter Tuning with Grid Search for Text Categorization using kNN Approach with BM25 Similarity

Cited by: 66
Authors
Ghawi, Raji [1]
Pfeffer, Juergen [1]
Affiliations
[1] Tech Univ Munich, Munich, Germany
Source
OPEN COMPUTER SCIENCE | 2019 / Vol. 9 / Issue 01
Keywords
hyperparameter tuning; text categorization; grid search; kNN; BM25; optimization
DOI
10.1515/comp-2019-0011
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In machine learning, hyperparameter tuning is the problem of choosing a set of optimal hyperparameters for a learning algorithm. Several approaches have been widely adopted for hyperparameter tuning, which is typically a time-consuming process. We propose an efficient technique to speed up hyperparameter tuning with Grid Search. We applied this technique to text categorization using the kNN algorithm with BM25 similarity, where three hyperparameters need to be tuned. Our experiments show that the proposed technique is at least an order of magnitude faster than conventional tuning.
Pages: 160-180
Page count: 21