15 entries in total
- [1] [Anonymous], 2014, ACM INT C SUP 25 ANN
- [2] [Anonymous], 2018, Math kernel library
- [3] [Anonymous], 2016, Intel Xeon Phi Processor High Performance Programming, DOI 10.1016/B978-0-12-809194-4.00022-3
- [4] Anatomy of high-performance matrix multiplication [J]. ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2008, 34 (03):
- [5] Gunnels J. A., 2001, Computational Science - ICCS 2001. International Conference. Proceedings, Part I (Lecture Notes in Computer Science Vol.2073), P51
- [6] Design and Implementation of the Linpack Benchmark for Single and Multi-Node Systems Based on Intel® Xeon Phi™ Coprocessor [J]. IEEE 27TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2013), 2013: 126 - 137
- [7] OpenMP-based parallel implementation of matrix-matrix multiplication on the Intel Knights Landing [J]. HPC ASIA'18: PROCEEDINGS OF WORKSHOPS OF HPC ASIA, 2018: 63 - 66
- [8] An implementation of matrix-matrix multiplication on the Intel KNL processor with AVX-512 [J]. CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2018, 21 (04): 1785 - 1795
- [9] Analytical Modeling Is Enough for High-Performance BLIS [J]. ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2016, 43 (02):
- [10] Anatomy of High-Performance Many-Threaded Matrix Multiplication [J]. 2014 IEEE 28TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM, 2014,