Realization of Random Forest for Real-Time Evaluation through Tree Framing

Cited by: 22
Authors
Buschjaeger, Sebastian [1]
Chen, Kuan-Hsun [2]
Chen, Jian-Jia [2]
Morik, Katharina [1]
Affiliations
[1] TU Dortmund Univ, Artificial Intelligence Unit, Dortmund, Germany
[2] TU Dortmund Univ, Design Automat Embedded Syst Grp, Dortmund, Germany
Source
2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM) | 2018
Keywords
random forest; decision trees; caching; computer architecture
DOI
10.1109/ICDM.2018.00017
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405
Abstract
The optimization of learning has always been of particular concern for big data analytics. However, the ongoing integration of machine learning models into everyday life also demands that evaluation be extremely fast and run in real time. Moreover, in the Internet of Things, the computing facilities that run the learned model are restricted. Hence, the implementation of the model application must take the characteristics of the executing platform into account. Although there exist some heuristics that optimize the code, principled approaches for fast execution of learned models are rare. In this paper, we introduce a method that optimizes the execution of Decision Trees (DT). Decision Trees form the basis of many ensemble methods, such as Random Forests (RF) or Extremely Randomized Trees (ET). For these methods to work best, trees should be as large as possible. This challenges the data and the instruction cache of modern CPUs and thus demands a more careful memory layout. Based on a probabilistic view of decision tree execution, we optimize the two most common implementation schemes of decision trees. We discuss the advantages and disadvantages of both implementations and present a theoretically well-founded memory layout which maximizes locality during execution in both cases. The method is applied to three computer architectures, namely ARM (RISC), PPC (Extended RISC) and Intel (CISC), and is automatically adapted to the specific architecture by a code generator. We perform over 1800 experiments on several real-world data sets and report an average speed-up by a factor of 2 to 4 across all three architectures when using the proposed memory layout. Moreover, we find that our implementation outperforms sklearn, which was used to train the models, by a factor of 1500.
Pages: 19-28
Page count: 10
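
Illustrative sketch (not taken from the paper): the abstract describes a memory layout that uses branch probabilities to keep likely execution paths close together. The Python below is a minimal illustration of that general idea under several assumptions: the `Node` class, the `p_left` field, and the greedy "likelier child first" ordering are hypothetical choices made here, and the paper's actual layout algorithm and code generator may differ. Python cannot demonstrate the cache effect itself; the sketch only shows how such an ordering can be computed and then traversed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    feature: int = -1            # index of the tested feature; -1 marks a leaf
    threshold: float = 0.0       # go left when x[feature] <= threshold
    p_left: float = 0.5          # assumed empirical probability of the left branch
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: int = 0               # prediction, only meaningful in leaves


def flatten(root: Node) -> List[list]:
    """Lay the tree out as rows [feature, threshold, left_idx, right_idx, label].

    Assumption for illustration: the more probable child is stored directly
    after its parent, so the most likely path touches neighbouring rows.
    """
    rows: List[list] = []

    def visit(node: Node) -> int:
        idx = len(rows)
        rows.append([node.feature, node.threshold, -1, -1, node.label])
        if node.feature >= 0:
            left_first = node.p_left >= 0.5
            first, second = (node.left, node.right) if left_first else (node.right, node.left)
            first_idx = visit(first)     # lands at idx + 1
            second_idx = visit(second)   # lands after the whole first subtree
            if left_first:
                rows[idx][2], rows[idx][3] = first_idx, second_idx
            else:
                rows[idx][2], rows[idx][3] = second_idx, first_idx
        return idx

    visit(root)
    return rows


def predict(rows: List[list], x: List[float]) -> int:
    """Evaluate the array-based tree with a simple loop."""
    i = 0
    while rows[i][0] >= 0:
        feature, threshold, left, right, _ = rows[i]
        i = left if x[feature] <= threshold else right
    return rows[i][4]


# Hypothetical example tree: the right branch of the root is taken 80% of the time.
example_tree = Node(
    feature=0, threshold=0.5, p_left=0.2,
    left=Node(label=0),
    right=Node(feature=1, threshold=1.0, p_left=0.9,
               left=Node(label=1), right=Node(label=2)),
)
layout = flatten(example_tree)
```

In this example, the root's likelier (right) child is stored at index 1 and that node's likelier (left) leaf at index 2, so the most probable path reads rows 0, 1, 2 in sequence, while the unlikely left leaf of the root is pushed to the end of the array.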