Detecting Data Races in OpenMP with Deep Learning and Large Language Models

Authors
Alsofyani, May [1]
Wang, Liqiang [1]
Affiliation
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
Source
53rd International Conference on Parallel Processing, ICPP 2024 | 2024
Keywords
data race; race condition; bug detection; OpenMP; transformer encoder; large language model; CodeBERTa; GPT-4 Turbo
DOI
10.1145/3677333.3678160
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Transformer-based neural network models are increasingly employed to handle software engineering issues, such as bug localization and program repair. These models, equipped with a self-attention mechanism, excel at understanding source code context and semantics. Recently, large language models (LLMs) have emerged as a promising alternative for analyzing and understanding code structure. In this paper, we propose two novel methods for detecting data race bugs in OpenMP programs. The first method is based on a transformer encoder trained from scratch. The second method leverages LLMs, specifically extending GPT-4 Turbo through the use of prompt engineering and fine-tuning techniques. For training and testing our approach, we utilized two datasets comprising different OpenMP directives. Our experiments show that the transformer encoder achieves competitive accuracy compared to LLMs, whether through fine-tuning or prompt engineering techniques. This performance may be attributed to the complexity of many OpenMP directives and the limited availability of labeled datasets.
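To make the prompt-engineering route concrete, the following is a minimal sketch of how an OpenMP snippet might be wrapped in a yes/no classification prompt. The function names, the prompt wording, and the sample snippet are all hypothetical illustrations; this record does not reproduce the paper's actual prompts or datasets, and no model API is called here.

```python
# Hypothetical sketch: names and prompt wording are assumptions,
# not the prompts used in the paper.

# A classic OpenMP data race: every thread updates the shared variable
# 'sum' without a reduction clause or any other synchronization.
RACY_SNIPPET = """\
int sum = 0;
#pragma omp parallel for
for (int i = 0; i < n; i++)
    sum += a[i];
"""


def build_race_prompt(code: str) -> str:
    """Wrap an OpenMP snippet in a yes/no data race classification prompt."""
    return (
        "You are an expert in OpenMP.\n"
        "Does the following code contain a data race? "
        "Answer 'yes' or 'no'.\n\n"
        f"{code}"
    )


def parse_label(reply: str) -> bool:
    """Map a model reply to a boolean label (True means a race was reported)."""
    return reply.strip().lower().startswith("yes")


prompt = build_race_prompt(RACY_SNIPPET)
```

The resulting `prompt` string would be sent to whichever LLM is under evaluation, and `parse_label` turns the free-text reply into a binary label that can be scored against ground truth.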
Pages: 96-103
Page count: 8