Kernel-based machine learning for fast text mining in R

Cited by: 24
Authors
Karatzoglou, Alexandros [1]
Feinerer, Ingo [2]
Affiliations
[1] INSA Rouen, LITIS, F-76801 St Etienne, France
[2] Vienna Univ Technol, Inst Informat Syst, Database & Artificial Intelligence Grp, Vienna, Austria
DOI
10.1016/j.csda.2009.09.023
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification codes
081203; 0835;
Abstract
Recent advances in kernel-based machine learning allow fast processing of text using string kernels that are computed with suffix arrays. The kernlab package provides both the infrastructure for kernel methods and a large collection of already implemented algorithms, including an implementation of suffix-array-based string kernels. Combined with the text mining infrastructure provided by the tm package, it gives R functionality for processing, visualizing and grouping large collections of text data using kernel methods. The emphasis is on the performance of various types of string kernels at these tasks. (C) 2009 Elsevier B.V. All rights reserved.
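To illustrate what a string kernel computes, the following is a minimal sketch in Python of one common type, the spectrum kernel, which compares two texts by the counts of their shared length-k substrings. It is not the kernlab implementation and does not use suffix arrays; it is a naive counting version written only for illustration, and the function names (`spectrum_features`, `spectrum_kernel`, `gram_matrix`) and the default `k=3` are assumptions, not taken from the paper.

```python
from collections import Counter
from math import sqrt


def spectrum_features(text, k):
    """Count every contiguous substring of length k in text."""
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


def spectrum_kernel(a, b, k=3, normalized=True):
    """Spectrum string kernel: inner product of the k-mer count vectors.

    Naive illustration only; a suffix-array-based implementation
    would avoid building explicit count tables.
    """
    fa, fb = spectrum_features(a, k), spectrum_features(b, k)
    value = sum(count * fb[sub] for sub, count in fa.items())
    if not normalized:
        return float(value)
    # Normalise so that a text compared with itself scores 1.0.
    norm = sqrt(sum(c * c for c in fa.values()) * sum(c * c for c in fb.values()))
    return value / norm if norm else 0.0


def gram_matrix(docs, k=3):
    """Kernel (Gram) matrix over a list of documents."""
    return [[spectrum_kernel(x, y, k) for y in docs] for x in docs]
```

A Gram matrix built this way is the kind of input that kernel methods for clustering or visualization operate on; for example, `gram_matrix(["text mining", "text kernel", "suffix array"])` returns a symmetric 3 x 3 matrix of similarities.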
Pages: 290-297
Page count: 8