Automatic Operator Performance Tuning in a Machine Learning System on Edge

Cited by: 0
Authors
Xu, Peng [1]
Chang, Xinyu [1]
Zhao, Jianxin [1]
Liu, Chi Harold [1]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing, Peoples R China
Source
2022 IEEE 28TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS, ICPADS | 2022
Funding
National Natural Science Foundation of China
Keywords
optimization; automatic tuning; convolution; machine learning system
DOI
10.1109/ICPADS56603.2022.00109
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the large-scale deployment of machine learning technologies on cloud servers, edge devices, and IoT hardware, machine learning systems have become widely prevalent. Practical requirements have driven improvements in their performance in both academia and industry. However, requirements vary greatly across applications, and directly using off-the-shelf systems may not be sufficient in many cases. In this work, we first propose a series of techniques to optimize the performance of convolution, one of the most important operations in constructing deep learning networks. We also propose applying the automated empirical optimisation of software approach to improve the performance of operators in machine learning systems, most notably across various hardware platforms. Evaluation against existing libraries on different hardware devices demonstrates the efficiency of our proposed method.
Pages: 802-809
Page count: 8