38 entries in total
[11]
Kokkos: Enabling performance portability across manycore architectures [J]. 2013 EXTREME SCALING WORKSHOP (XSW 2013), 2014: 18-24
[12]
Georganas E., 2018, P SC18 INT C HIGH PE, P830
[13]
Towards Cross-Platform Performance Portability of DNN Models using SYCL [J]. PROCEEDINGS OF 2020 IEEE/ACM INTERNATIONAL WORKSHOP ON PERFORMANCE, PORTABILITY AND PRODUCTIVITY IN HPC (P3HPC 2020), 2020: 25-35
[14]
Gomez-Hernandez EJ., 2020, 13 INT WORKSH PROGR, P11
[15]
Guo K., 2021, NEURAL NETWORK ACCEL
[17]
Hill, 2020, ACCELERATOR LEVEL PA
[18]
Intel, 2020, ONEAPI SPEC
[19]
Caffe: Convolutional Architecture for Fast Feature Embedding [J]. PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14), 2014: 675-678
[20]
In-Datacenter Performance Analysis of a Tensor Processing Unit [J]. 44TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2017), 2017: 1-12