27 references in total
- [11] Kumar V.B.Y., Joshi S., Patkar S.B., et al., FPGA based high performance double-precision matrix multiplication, International Journal of Parallel Programming, 38, 3, pp. 322-338, (2010)
- [12] Wu G., Parallel algorithms and architectures for matrix computations on FPGA, (2011)
- [13] Wu G., Dou Y., Wang M., High performance and memory efficient implementation of matrix multiplication on FPGAs, Proc of the 9th IEEE Int Conf on Field Programmable Technology, pp. 134-137, (2010)
- [14] Zhou L., Tao Y., Liu S., et al., Research on systolic multiplication and technology based on FPGA, Journal of Computer Science and Engineering, 37, 9, pp. 1632-1636, (2015)
- [15] Jovanović Z., Milutinović V., FPGA accelerator for floating-point matrix multiplication, IET Computers & Digital Techniques, 6, 4, pp. 249-256, (2012)
- [16] Lei Y., Chen X., Peng Y., A high energy efficiency FFT accelerator on DSP chip, Journal of Computer Research and Development, 53, 7, pp. 1438-1446, (2016)
- [17] Qian L., Zhao J., Peng D., et al., Energy-efficient fingerprint matching based on reconfigurable micro server, Journal of Computer Research and Development, 53, 7, pp. 1425-1437, (2016)
- [18] Jouppi N.P., Young C., Patil N., et al., In-datacenter performance analysis of a tensor processing unit, Proc of the 44th IEEE Int Symp on Computer Architecture, pp. 1-12, (2017)
- [19] Inside Volta: The world's most advanced data-center GPU
- [20] Sze V., Chen Y.H., Suleiman A., et al., Hardware for machine learning: Challenges and opportunities, Proc of the 30th IEEE Custom Integrated Circuits Conf., pp. 299-306, (2017)