Mining Tree-Structured Data on Multicore Systems

被引:0
作者
Tatikonda, Shirish [1 ]
Parthasarathy, Srinivasan [1 ]
机构
[1] Ohio State Univ, Columbus, OH 43210 USA
来源
PROCEEDINGS OF THE VLDB ENDOWMENT | 2009年 / 2卷 / 01期
基金
美国国家科学基金会;
关键词
D O I
暂无
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Mining frequent subtrees in a database of rooted and labeled trees is an important problem in many domains, ranging from phylogenetic analysis to biochemistry and from linguistic parsing to XML data analysis. In this work we revisit this problem and develop an architecture conscious solution targeting emerging multicore systems. Specifically we identify a sequence of memory related optimizations that significantly improve the spatial and temporal locality of a state-of-the-art sequential algorithm - alleviating the effects of memory latency. Additionally, these optimizations are shown to reduce the pressure on the front-side bus, an important consideration in the context of large-scale multicore architectures. We then demonstrate that these optimizations while necessary are not sufficient for efficient parallelization on multicores, primarily due to parametric and data-driven factors which make load balancing a significant challenge. To address this challenge, we present a methodology that adaptively and automatically modulates the type and granularity of the work being shared among different cores. The resulting algorithm achieves near perfect parallel efficiency on up to 16 processors on challenging real world applications. The optimizations we present have general purpose utility and a key out-come is the development of a general purpose scheduling service for moldable task scheduling on emerging multicore systems.
引用
收藏
页数:12
相关论文
共 47 条
[1]   CODE GENERATION USING TREE MATCHING AND DYNAMIC-PROGRAMMING [J].
AHO, AV ;
GANAPATHI, M ;
TJIANG, SWK .
ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS, 1989, 11 (04) :491-516
[2]  
Aoki Kiyoko F, 2003, Genome Inform, V14, P134
[3]  
Asai T, 2002, SIAM PROC S, P158
[4]   Clone detection using abstract syntax trees [J].
Baxter, ID ;
Yahin, A ;
Moura, L ;
Sant'Anna, M ;
Bier, L .
INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE, PROCEEDINGS, 1998, :368-377
[5]  
Berndt D. J., 1996, ADV KNOWLEDGE DISCOV, P229
[6]  
Buehrer G, 2006, IEEE DATA MINING, P97
[7]  
Buehrer Gregory, 2006, P 12 ACM SIGKDD INT, P86
[8]  
Charniak E, 1996, PROCEEDINGS OF THE THIRTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE, VOLS 1 AND 2, P1031
[9]  
Chi Y, 2005, FUND INFORM, V66, P161
[10]  
Chi Y, 2004, LECT NOTES ARTIF INT, V3056, P63