Detecting Data Races in OpenMP with Deep Learning and Large Language Models

Cited by: 0
Authors
Alsofyani, May [1]
Wang, Liqiang [1]
Affiliations
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
Source
53rd International Conference on Parallel Processing (ICPP 2024), 2024
Keywords
data race; race condition; bug detection; OpenMP; transformer encoder; large language model; CodeBERTa; GPT-4 Turbo
DOI
10.1145/3677333.3678160
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Transformer-based neural network models are increasingly employed to address software engineering problems such as bug localization and program repair. Equipped with a self-attention mechanism, these models excel at capturing the context and semantics of source code. Recently, large language models (LLMs) have emerged as a promising alternative for analyzing and understanding code structure. In this paper, we propose two novel methods for detecting data race bugs in OpenMP programs. The first is based on a transformer encoder trained from scratch; the second leverages LLMs, specifically adapting GPT-4 Turbo through prompt engineering and fine-tuning. For training and testing, we use two datasets covering different OpenMP directives. Our experiments show that the transformer encoder achieves accuracy competitive with the LLM-based approaches, whether the LLM is adapted via fine-tuning or prompt engineering. We attribute this outcome in part to the complexity of many OpenMP directives and the limited availability of labeled datasets.
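To make the targeted bug class concrete, the following is a minimal illustrative OpenMP sketch (an editor-added example, not drawn from the paper's datasets): the first loop contains a textbook data race on a shared accumulator, and the second shows one common fix using a reduction clause. Detectors like those described in the abstract would be asked to classify such loops as racy or race-free.

/* Illustrative sketch of an OpenMP data race and a common fix.
 * Compile with: gcc -fopenmp race_example.c -o race_example */
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void) {
    double sum_racy = 0.0, sum_safe = 0.0;

    /* Data race: every thread performs an unsynchronized
     * read-modify-write on the shared variable sum_racy,
     * so concurrent updates can be lost. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        sum_racy += 1.0;
    }

    /* Race-free version: the reduction clause gives each thread a
     * private copy of sum_safe and combines the copies after the loop. */
    #pragma omp parallel for reduction(+:sum_safe)
    for (int i = 0; i < N; i++) {
        sum_safe += 1.0;
    }

    printf("racy sum = %.0f (may be less than %d)\n", sum_racy, N);
    printf("safe sum = %.0f (always %d)\n", sum_safe, N);
    return 0;
}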
Pages: 96-103
Page count: 8