Characterization and Machine Learning Classification of AI and PC Workloads

Cited by: 1

Authors:

Sibai, Fadi N. [1]

Asaduzzaman, Abu [2]

El-Moursy, Ali [3]

Affiliations:

[1] Gulf Univ Sci & Technol, Dept Elect & Comp Engn, Mubarak Al Abdullah 32093, Kuwait

[2] Wichita State Univ, Elect & Comp Engn Dept, Wichita, KS USA

[3] Univ Sharjah, Elect & Comp Engn Dept, Sharjah, U Arab Emirates

Source:

IEEE Access | 2024 / Vol. 12

Keywords:

Benchmark testing; Artificial intelligence; Computational modeling; Machine learning; Training; Graphics processing units; Program processors; AI workloads; TensorFlow; PassMark PerformanceTest; AIBench; workload characterization; event counts; benchmark profiling; machine learning classification; VTune

DOI:

10.1109/ACCESS.2024.3413199

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

To better design AI processors, it is critical to characterize artificial intelligence (AI) workloads and contrast them with normal personal computer (PC) workloads. In this work, we profiled the AIBench and PassMark PerformanceTest benchmarks with the Intel oneAPI VTune Profiler on a multi-core computer. We captured and contrasted the CPU and platform metrics and event counts of these two distinct benchmarks. Using the Orange 3.0 data mining tool and the captured profile metrics and event counts, we then trained and tested 9 machine learning (ML) models to classify the clocks per instruction (CPI) and elapsed times of the various tests in these two benchmarks, including inference and training tests in AIBench, and CPU, memory, graphics, and disk tests in PassMark. The linear regression model emerged as the best CPI classifier, while the neural network model with 4 hidden layers was the best elapsed time classifier. This machine learning classification can help predict the CPI and elapsed time, and distinguish between AI and standard PC workloads, based on the profiled application(s) and the captured profile metrics and event counts. The stressed computer units identified by this detailed profiling work and exercised by the benchmark tests can also guide future AI processor design improvements.
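As a rough illustration of the quantities named in the abstract, the sketch below computes CPI from two event counts and fits a one-feature least-squares line that predicts CPI from a second metric. It is not the authors' pipeline: the paper used VTune profiles and Orange 3.0, whereas the counter names, the cache-miss feature, and every number here are hypothetical values chosen only to make the arithmetic easy to follow.

```python
# Illustrative only: the counter names and values below are made up,
# not taken from the paper's VTune profiles.

def cpi(clockticks: int, instructions_retired: int) -> float:
    """Clocks per instruction: core clockticks divided by retired instructions."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return clockticks / instructions_retired


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (single feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical per-test samples:
# (clockticks, instructions retired, cache misses per 1k instructions)
samples = [
    (2_000_000, 1_000_000, 10.0),
    (3_000_000, 1_000_000, 20.0),
    (4_000_000, 1_000_000, 30.0),
]

cpis = [cpi(ticks, instrs) for ticks, instrs, _ in samples]
miss_rates = [misses for _, _, misses in samples]

# Fit CPI as a linear function of the (assumed) miss-rate feature,
# then predict CPI for an unseen test with 25 misses per 1k instructions.
slope, intercept = fit_line(miss_rates, cpis)
predicted = slope * 25.0 + intercept
```

A real profile would supply many more features per test; the single-feature fit is used here only to keep the example self-contained.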


Pages: 83858-83875

Page count: 18

