Exploiting Parallelism and Vectorisation in Breadth-First Search for the Intel Xeon Phi

被引:2
作者
Paredes, Mireya [1 ]
Riley, Graham [1 ]
Lujan, Mikel [1 ]
机构
[1] Univ Manchester, Sch Comp Sci, Kilburn Bldg,Oxford Rd, Manchester M13 9PL, Lancs, England
基金
英国工程与自然科学研究理事会;
关键词
Breadth-first search; graph algorithms; hybrid BFS; vectorisation; parallel architecture; Graph; 500; Xeon Phi;
D O I
10.1109/TPDS.2019.2927451
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Modern applications generate massive amounts of data that is challenging to process or analyse. Graph algorithms have emerged as a solution for the analysis of such data because they can represent the entities participating in the generation of large-scale datasets in terms of vertices and their relationships in terms of edges. Graph analysis algorithms are used for finding patterns within these relationships, aiming to extract information to be further analysed. The breadth-first search (BFS) is one of the main graph search algorithms used for graph analysis and its optimisation has been widely researched using different parallel computers. However, the parallelisation of BFS has been shown to be challenging because of its inherent characteristics, including irregular memory access patterns, data dependencies and workload imbalance, that limit its scalability. This paper investigates the optimisation of the BFS on the Xeon Phi (Knights Corner), a modern parallel architecture provided with an advanced vector processor supporting the AVX-512 instruction set, using a bespoke development framework integrated with the Graph 500 benchmark. In addition, to demonstrate portability, we show results for a direct port of the algorithms to a more recent version of the Xeon Phi (Knights Landing) and to a Skylake CPU which supports most of the AVX-512 instruction set. Optimised parallel versions of two high-level algorithms for BFS were created using vectorisation, starting with the conventional top-down BFS algorithm and, building on this, a hybrid BFS algorithm. On the KNC our best implementations result in speedups of 1.37x (top-down) and 1.37x (hybrid), for a one million vertices graph, compared to the state-of-the-art. On the KNL and Skylake, the performance is higher than on KNC. In addition, we show results of our best hybrid algorithm on real-world graphs from the SNAP datasets with speedups up to 1.3x on KNC. Performance on KNL and Skylake is again higher, demonstrating the robustness and portability of our algorithm. The hybrid BFS algorithm can be further used to speed up other graph analysis algorithms and the lessons learned from vectorisation can be applied to other algorithms targeting existing and future models of the Xeon Phi and other advanced vector architectures.
引用
收藏
页码:111 / 128
页数:18
相关论文
共 50 条
  • [21] Exploiting Very-Wide Vectors on Intel Xeon Phi with Lattice-QCD kernels
    Diavastos, Andreas
    Stylianou, Giannos
    Koutsou, Giannis
    2016 24TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP), 2016, : 296 - 300
  • [22] Unrestricted and complete Breadth-First Search of trapezoid graphs in O(n) time
    Crespelle, Christophe
    Gambette, Philippe
    INFORMATION PROCESSING LETTERS, 2010, 110 (12-13) : 497 - 502
  • [23] Method of Workload Balancing in GPU Implementation of Breadth-First Search
    Chernoskutov, Mikhail
    2014 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS), 2014, : 17 - 22
  • [24] Fast Breadth-First Search Approximation for Epidemic Source Inference
    Li, Congduan
    Chen, Siya
    Tan, Chee Wei
    2022 56TH ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2022, : 194 - 199
  • [25] Virtual network embedding algorithm based on breadth-first search
    Peng, Limin
    Sichuan Daxue Xuebao (Gongcheng Kexue Ban)/Journal of Sichuan University (Engineering Science Edition), 2015, 47 (02): : 117 - 122
  • [26] Preliminary study on the automatic parallelism optimization model for image enhancement algorithms based on Intel's® Xeon Phi
    Huang, Fang
    Yang, Hao
    Tao, Jian
    Wang, Jian
    Tan, Xicheng
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (16)
  • [27] Breadth-First Search Approach for Mining Serial Episodes with Simultaneous Events
    Gandreti, Santhosh B.
    Ibrahim, A.
    Sastry, P. S.
    PROCEEDINGS OF 7TH JOINT INTERNATIONAL CONFERENCE ON DATA SCIENCE AND MANAGEMENT OF DATA, CODS-COMAD 2024, 2024, : 36 - 44
  • [28] Improving vertex-frontier based GPU breadth-first search
    杨博
    卢凯
    高颖慧
    徐凯
    王小平
    程志权
    Journal of Central South University, 2014, 21 (10) : 3828 - 3836
  • [29] An Implementation of Depth-First and Breadth-First Search Algorithms for Tip Selection in IOTA Distributed Ledger
    Ferenczi, Andras
    Badica, Costin
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2022, PT I, 2022, 13757 : 351 - 363
  • [30] Task-based Parallel Breadth-First Search in Heterogeneous Environments
    Munguia, Lluis-Miquel
    Bader, David A.
    Ayguade, Eduard
    2012 19TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC), 2012,