Can Large Language Models Write Parallel Code?

Cited by: 3

Authors:

Nichols, Daniel [1]

Davis, Joshua H. [1]

Xie, Zhaojun [1]

Rajaram, Arjun [1]

Bhatele, Abhinav [1]

Affiliations:

[1] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA

Source:

PROCEEDINGS OF THE 33RD INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE PARALLEL AND DISTRIBUTED COMPUTING, HPDC 2024 | 2024

Funding:

U.S. National Science Foundation

Keywords:

Large language models; Parallel code generation; Performance evaluation; Benchmarking; HPC;

DOI:

10.1145/3625549.3658689

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Large language models are increasingly becoming a popular tool for software development. Their ability to model and generate source code has been demonstrated in a variety of contexts, including code completion, summarization, translation, and lookup. However, they often struggle to generate code for complex programs. In this paper, we study the capabilities of state-of-the-art language models to generate parallel code. In order to evaluate language models, we create a benchmark, PAREVAL, consisting of prompts that represent 420 different coding tasks related to scientific and parallel computing. We use PAREVAL to evaluate the effectiveness of several state-of-the-art open- and closed-source language models on these tasks. We introduce novel metrics for evaluating the performance of generated code, and use them to explore how well each large language model performs for 12 different computational problem types and six different parallel programming models.

Pages: 14

References: 49