Detecting Data Races in OpenMP with Deep Learning and Large Language Models

Authors
Alsofyani, May [1]
Wang, Liqiang [1]
Affiliations
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
Source
53RD INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2024 | 2024
Keywords
data race; race condition; bug detection; OpenMP; transformer encoder; large language model; CodeBERTa; GPT-4 Turbo
DOI
10.1145/3677333.3678160
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Transformer-based neural network models are increasingly employed to handle software engineering issues, such as bug localization and program repair. These models, equipped with a self-attention mechanism, excel at understanding source code context and semantics. Recently, large language models (LLMs) have emerged as a promising alternative for analyzing and understanding code structure. In this paper, we propose two novel methods for detecting data race bugs in OpenMP programs. The first method is based on a transformer encoder trained from scratch. The second method leverages LLMs, specifically extending GPT-4 Turbo through the use of prompt engineering and fine-tuning techniques. For training and testing our approach, we utilized two datasets comprising different OpenMP directives. Our experiments show that the transformer encoder achieves competitive accuracy compared to LLMs, whether through fine-tuning or prompt engineering techniques. This performance may be attributed to the complexity of many OpenMP directives and the limited availability of labeled datasets.
Pages: 96-103
Number of pages: 8