Generating Code and Memory Buffers to Reorganize Data on Many-core Architectures

Cited by: 8
Authors
Cudennec, Loic [1]
Dubrulle, Paul [1]
Galea, Francois [1]
Goubier, Thierry [1]
Sirdey, Renaud [1]
Affiliations
[1] CEA, LIST, Saclay, France
Source
2014 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE | 2014 / Vol. 29
Keywords
Many-core; Dataflow; Compilation; Data reorganization; COMPILER;
DOI
10.1016/j.procs.2014.05.101
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
The dataflow programming model has been shown to be a relevant approach to efficiently run massively parallel applications on many-core architectures. In this model, particular built-in agents are in charge of data reorganizations between user agents. Such agents can Split, Join and Duplicate data onto their communication ports. They are widely used, for example, in signal processing. These system agents, and their associated implementations, are of major importance when it comes to performance, because they can lie on the critical path (think of Amdahl's law). Furthermore, a particular data reorganization can be expressed by the developer in several ways, some of which lead to inefficient solutions (mostly unneeded data copies and transfers). In this paper, we propose several strategies to manage data reorganization at compile time, with a focus on indexed accesses to shared buffers to avoid data copies. These strategies are complementary: they ensure correctness for each system agent configuration, as well as performance when possible. They have been implemented within the Sigma-C industry-grade compilation toolchain and evaluated on the Kalray MPPA 256-core processor.
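The idea of replacing copies with indexed accesses to a shared buffer can be illustrated with a minimal sketch. It is not the Sigma-C implementation: the function names `split_indices`, `duplicate_indices` and `read_port`, and the round-robin Split semantics with per-port chunk sizes, are assumptions made for illustration only. Each consumer receives a list of positions in the producer's buffer instead of a private copy of the data.

```python
from typing import Iterator, List, Sequence


def split_indices(weights: List[int], n_items: int) -> List[List[int]]:
    """Hypothetical round-robin Split: port i takes weights[i] consecutive
    items per cycle. Returns, for each port, the positions it reads in the
    shared input buffer (no data is moved)."""
    if not weights or any(w <= 0 for w in weights):
        raise ValueError("weights must be a non-empty list of positive ints")
    ports: List[List[int]] = [[] for _ in weights]
    pos = 0
    while pos < n_items:
        for port, w in enumerate(weights):
            for _ in range(w):
                if pos >= n_items:
                    break
                ports[port].append(pos)
                pos += 1
    return ports


def duplicate_indices(n_ports: int, n_items: int) -> List[List[int]]:
    """Hypothetical Duplicate: every port reads every position."""
    return [list(range(n_items)) for _ in range(n_ports)]


def read_port(buffer: Sequence, indices: List[int]) -> Iterator:
    """Consume a port by indexing into the shared buffer, without copying it."""
    return (buffer[i] for i in indices)


if __name__ == "__main__":
    shared = ["a", "b", "c", "d", "e", "f"]  # producer's buffer, written once
    for port, idx in enumerate(split_indices([2, 1], len(shared))):
        print("split port", port, list(read_port(shared, idx)))
    for port, idx in enumerate(duplicate_indices(2, len(shared))):
        print("dup port", port, list(read_port(shared, idx)))
```

In this sketch the index lists play the role of the compile-time artifact: they depend only on the agent configuration (weights, number of ports, buffer size), not on the data, so they could be computed once ahead of execution.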
Pages: 1123 - 1133
Number of pages: 11
Related papers
13 in total
  • [1] Language and compiler design for streaming applications
    Amarasinghe, S
    Gordon, MI
    Karczmarek, M
    Lin, J
    Maze, D
    Rabbah, RM
    Thies, W
    [J]. INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2005, 33 (2-3) : 261 - 278
  • [2] Extended Cyclostatic Dataflow Program Compilation and Execution for an Integrated Manycore Processor
    Aubry, Pascal
    Beaucamps, Pierre-Edouard
    Blanc, Frederic
    Bodin, Bruno
    Carpov, Sergiu
    Cudennec, Loic
    David, Vincent
    Dore, Philippe
    Dubrulle, Paul
    Dupont de Dinechin, Benoit
    Galea, Francois
    Goubier, Thierry
    Harrand, Michel
    Jones, Samuel
    Lesage, Jean-Denis
    Louise, Stephane
    Morey Chaisemartin, Nicolas
    Thanh Hai Nguyen
    Raynaud, Xavier
    Sirdey, Renaud
    [J]. 2013 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, 2013, 18 : 1624 - 1633
  • [3] Bartenstein TW, 2013, PROCEEDINGS OF THE 35TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2013), P532, DOI 10.1109/ICSE.2013.6606599
  • [4] Cyclo-static dataflow
    Bilsen, G
    Engels, M
    Lauwereins, R
    Peperstraete, J
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1996, 44 (02) : 397 - 408
  • [5] Throughput constrained parallelism reduction in cyclo-static dataflow applications
    Carpov, Sergiu
    Cudennec, Loic
    Sirdey, Renaud
    [J]. 2013 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, 2013, 18 : 30 - 39
  • [6] Cudennec L., 2012, P 12 INT C COMP SCI
  • [7] Eker J., 2003, MO348 UCBERL EECS DE
  • [8] A parallel simulated annealing approach for the mapping of large process networks
    Galea, Francois
    Sirdey, Renaud
    [J]. 2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS & PHD FORUM (IPDPSW), 2012, : 1787 - 1792
  • [9] A stream compiler for communication-exposed architectures
    Gordon, MI
    Thies, W
    Karczmarek, M
    Lin, J
    Meli, AS
    Lamb, AA
    Leger, C
    Wong, J
    Hoffmann, H
    Maze, D
    Amarasinghe, S
    [J]. ACM SIGPLAN NOTICES, 2002, 37 (10) : 291 - 303
  • [10] Goubier T, 2011, LECT NOTES COMPUT SC, V7916, P385, DOI 10.1007/978-3-642-24650-0_33