10 entries in total
[1]
Calore E., 2016, EXPERIENCE VECTORIZI, P53
[2]
Kokkos: Enabling performance portability across manycore architectures. 2013 EXTREME SCALING WORKSHOP (XSW 2013), 2014: 18-24
[3]
Edwards HC, 2012, SCI PROGRAMMING-NETH, V20, P89, DOI [10.1155/2012/917630, 10.3233/SPR-2012-0343]
[4]
Esterie P., 2014, Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing, P1, DOI 10.1145/2568058.2568063
[5]
Impact of Data Structure Layout on Performance. PROCEEDINGS OF THE 2013 21ST EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING, 2013: 116-120
[6]
Karpinski P., 2017, P 8 INT WORKSH PROGR, P21, DOI 10.1145/3026937.3026939
[7]
Vc: A C++ library for explicit vectorization. SOFTWARE-PRACTICE & EXPERIENCE, 2012, 42 (11): 1409-1430
[8]
Majeti D., 2013, European Conference on Parallel Processing, P188
[9]
Majeti D., 2016, P 25 INT C COMPILER, P240
[10]
Xu S., 2014, Network and Parallel Computing. 11th IFIP WG 10.3 International Conference, NPC 2014. Proceedings: LNCS 8707, P485, DOI 10.1007/978-3-662-44917-2_40