Runtime and memory consumption analyses for machine learning R programs

Cited by: 11
Authors
Kotthaus, Helena [1]
Korb, Ingo [1]
Lang, Michel [2]
Bischl, Bernd [2]
Rahnenfuehrer, Joerg [2]
Marwedel, Peter [1]
Affiliations
[1] TU Dortmund Univ, Dept Comp Sci 12, D-44227 Dortmund, Germany
[2] TU Dortmund Univ, Dept Stat, D-44227 Dortmund, Germany
Keywords
machine learning; performance analyses; profiling; classification algorithms
DOI
10.1080/00949655.2014.925192
Chinese Library Classification number
TP39 [Computer applications]
Subject classification code
081203; 0835
Abstract
R is a multi-paradigm language with a dynamic type system, different object systems, and functional characteristics. These characteristics support the development of statistical algorithms at a high level of abstraction. Although R is commonly used in the statistics domain, a major disadvantage is its poor runtime performance on computation-intensive algorithms. Especially in the domain of machine learning, the execution of pure R programs is often unacceptably slow. Our long-term goal is to resolve these issues; in this contribution, we used the traceR tool to analyse the bottlenecks arising in this domain. We measured the runtime and overall memory consumption on a well-defined set of classical machine learning applications and gained detailed insights into the performance issues of these programs.
Pages: 14-29
Page count: 16