Realization of Random Forest for Real-Time Evaluation through Tree Framing

Cited by: 22
Authors
Buschjaeger, Sebastian [1]
Chen, Kuan-Hsun [2]
Chen, Jian-Jia [2]
Morik, Katharina [1]
Affiliations
[1] TU Dortmund Univ, Artificial Intelligence Unit, Dortmund, Germany
[2] TU Dortmund Univ, Design Automat Embedded Syst Grp, Dortmund, Germany
Source
2018 IEEE International Conference on Data Mining (ICDM) | 2018
Keywords
random forest; decision trees; caching; computer architecture
DOI
10.1109/ICDM.2018.00017
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
The optimization of learning has always been of particular concern for big data analytics. However, the ongoing integration of machine learning models into everyday life also demands that evaluation be extremely fast and run in real time. Moreover, in the Internet of Things, the computing facilities that run the learned model are restricted. Hence, the implementation of the model application must take the characteristics of the executing platform into account. Although there exist some heuristics that optimize the code, principled approaches for fast execution of learned models are rare. In this paper, we introduce a method that optimizes the execution of Decision Trees (DT). Decision Trees form the basis of many ensemble methods, such as Random Forests (RF) or Extremely Randomized Trees (ET). For these methods to work best, trees should be as large as possible. This challenges the data and the instruction cache of modern CPUs and thus demands a more careful memory layout. Based on a probabilistic view of decision tree execution, we optimize the two most common implementation schemes of decision trees. We discuss the advantages and disadvantages of both implementations and present a theoretically well-founded memory layout which maximizes locality during execution in both cases. The method is applied to three computer architectures, namely ARM (RISC), PPC (Extended RISC) and Intel (CISC), and is automatically adapted to the specific architecture by a code generator. We perform over 1800 experiments on several real-world data sets and report an average speed-up by a factor of 2 to 4 across all three architectures when using the proposed memory layout. Moreover, we find that our implementation outperforms sklearn, which was used to train the models, by a factor of 1500.
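The idea of a probability-aware memory layout can be illustrated with a minimal sketch. It is not the authors' code generator or their layout algorithm: the `Node` fields, the `p_left` branch probability, and the greedy `relayout` function are all assumptions made for illustration. The sketch stores a tree as a flat array, evaluates it with a loop, and reorders the array so that the more probable child of each node sits directly after its parent.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Node:
    feature: int = -1       # feature index tested here; -1 marks a leaf
    threshold: float = 0.0  # go left when x[feature] <= threshold
    left: int = -1          # array index of the left child
    right: int = -1         # array index of the right child
    prediction: int = 0     # class label, only meaningful for leaves
    p_left: float = 0.5     # assumed empirical probability of going left


def predict(nodes: Sequence[Node], x: Sequence[float]) -> int:
    """Evaluate a tree stored as a flat array, starting at index 0."""
    i = 0
    while nodes[i].feature >= 0:
        n = nodes[i]
        i = n.left if x[n.feature] <= n.threshold else n.right
    return nodes[i].prediction


def relayout(nodes: Sequence[Node]) -> List[Node]:
    """Greedy reordering (illustrative assumption): depth-first traversal
    that always visits the more probable child first, so that child is
    stored immediately after its parent."""
    order = []
    stack = [0]
    while stack:
        i = stack.pop()
        order.append(i)
        n = nodes[i]
        if n.feature >= 0:
            if n.p_left >= 0.5:
                likely, unlikely = n.left, n.right
            else:
                likely, unlikely = n.right, n.left
            stack.append(unlikely)
            stack.append(likely)  # popped next, so placed right after the parent
    new_index = {old: new for new, old in enumerate(order)}
    out = []
    for old in order:
        n = nodes[old]
        if n.feature >= 0:
            out.append(Node(n.feature, n.threshold, new_index[n.left],
                            new_index[n.right], n.prediction, n.p_left))
        else:
            out.append(Node(prediction=n.prediction))
    return out


# Hypothetical toy tree: the root mostly branches right.
example = [
    Node(0, 0.5, 1, 2, p_left=0.1),
    Node(prediction=0),
    Node(1, 0.5, 3, 4, p_left=0.8),
    Node(prediction=1),
    Node(prediction=2),
]
optimized = relayout(example)
```

Both arrays encode the same tree, so `predict` returns identical labels for either; only the order in which nodes are stored changes, which is what affects cache behavior on real hardware.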
Pages: 19 - 28
Number of pages: 10