Realization of Random Forest for Real-Time Evaluation through Tree Framing

Times Cited: 23
Authors
Buschjaeger, Sebastian [1 ]
Chen, Kuan-Hsun [2 ]
Chen, Jian-Jia [2 ]
Morik, Katharina [1 ]
Affiliations
[1] TU Dortmund Univ, Artificial Intelligence Unit, Dortmund, Germany
[2] TU Dortmund Univ, Design Automat Embedded Syst Grp, Dortmund, Germany
Source
2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM) | 2018
Keywords
random forest; decision trees; caching; computer architecture;
DOI
10.1109/ICDM.2018.00017
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The optimization of learning has always been of particular concern for big data analytics. However, the ongoing integration of machine learning models into everyday life also demands that model evaluation be extremely fast and happen in real time. Moreover, in the Internet of Things, the computing facilities that run the learned model are restricted. Hence, the implementation of the model application must take the characteristics of the executing platform into account. Although there exist some heuristics that optimize the code, principled approaches for fast execution of learned models are rare. In this paper, we introduce a method that optimizes the execution of Decision Trees (DT). Decision Trees form the basis of many ensemble methods, such as Random Forests (RF) or Extremely Randomized Trees (ET). For these methods to work best, trees should be as large as possible. This challenges the data and the instruction caches of modern CPUs and thus demands a more careful memory layout. Based on a probabilistic view of decision tree execution, we optimize the two most common implementation schemes of decision trees. We discuss the advantages and disadvantages of both implementations and present a theoretically well-founded memory layout which maximizes locality during execution in both cases. The method is applied to three computer architectures, namely ARM (RISC), PPC (Extended RISC) and Intel (CISC), and is automatically adapted to the specific architecture by a code generator. We perform over 1800 experiments on several real-world data sets and report an average speed-up of 2 to 4 across all three architectures by using the proposed memory layout. Moreover, we find that our implementation outperforms sklearn, which was used to train the models, by a factor of 1500.
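The abstract contrasts two common decision-tree implementation schemes: "native" if-else code generation, which stresses the instruction cache, and a node-array traversal, which stresses the data cache. As an illustrative sketch only (this is not the paper's implementation, and all names here are hypothetical), the array-based scheme stores nodes contiguously and follows child indices, so the order of nodes in the array determines cache locality:

```python
# Sketch of the array-based decision-tree scheme: nodes live in one flat
# list, and prediction walks the list by index rather than chasing
# pointers. Reordering this list changes memory locality without
# changing the predictions.
from dataclasses import dataclass
from typing import List

@dataclass
class Node:
    feature: int      # index of the feature to test; -1 marks a leaf
    threshold: float  # split threshold (unused at leaves)
    left: int         # array index of the left child (x[feature] <= threshold)
    right: int        # array index of the right child
    value: int        # predicted class label, valid only at leaves

def predict(nodes: List[Node], x: List[float]) -> int:
    """Walk the flat node array from the root (index 0) to a leaf."""
    i = 0
    while nodes[i].feature != -1:
        node = nodes[i]
        i = node.left if x[node.feature] <= node.threshold else node.right
    return nodes[i].value

# A tiny tree: the root splits on feature 0 at 0.5; both children are leaves.
tree = [
    Node(feature=0, threshold=0.5, left=1, right=2, value=-1),
    Node(feature=-1, threshold=0.0, left=-1, right=-1, value=0),  # x[0] <= 0.5
    Node(feature=-1, threshold=0.0, left=-1, right=-1, value=1),  # x[0] >  0.5
]

print(predict(tree, [0.3]))  # class 0
print(predict(tree, [0.9]))  # class 1
```

The paper's contribution can be read against this sketch: using a probabilistic model of which branches are taken, frequently co-visited nodes are placed adjacently in the array (or, in the if-else scheme, in the generated instruction stream) so that hot paths stay resident in cache.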
Pages: 19-28
Number of Pages: 10