DeepBugs: A learning approach to name-based bug detection

Cited by: 191
Authors
Pradel M. [1]
Sen K. [2, 3]
Affiliations
[1] TU Darmstadt, Department of Computer Science
[2] University of California, Berkeley
Keywords
Bug detection; JavaScript; Machine learning; Name-based program analysis; Natural language
DOI
10.1145/3276517
Abstract
Natural language elements in source code, e.g., the names of variables and functions, convey useful information. However, most existing bug detection tools ignore this information and therefore miss some classes of bugs. The few existing name-based bug detection approaches reason about names on a syntactic level and rely on manually designed and tuned algorithms to detect bugs. This paper presents DeepBugs, a learning approach to name-based bug detection, which reasons about names based on a semantic representation and which automatically learns bug detectors instead of manually writing them. We formulate bug detection as a binary classification problem and train a classifier that distinguishes correct from incorrect code. To address the challenge that effectively learning a bug detector requires examples of both correct and incorrect code, we create likely incorrect code examples from an existing corpus of code through simple code transformations. A novel insight learned from our work is that learning from artificially seeded bugs yields bug detectors that are effective at finding bugs in real-world code. We implement our idea into a framework for learning-based and name-based bug detection. Three bug detectors built on top of the framework detect accidentally swapped function arguments, incorrect binary operators, and incorrect operands in binary operations. Applying the approach to a corpus of 150,000 JavaScript files yields bug detectors that have a high accuracy (between 89% and 95%), are very efficient (less than 20 milliseconds per analyzed file), and reveal 102 programming mistakes (with 68% true positive rate) in real-world code. © 2018 Copyright held by the owner/author(s).
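The idea in the abstract of creating likely incorrect examples through simple code transformations can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Call` record, the functions `seed_swapped_args` and `build_training_set`, the restriction to two-argument calls, and the label convention (0 for code taken from the corpus, 1 for seeded bugs) are all assumptions made for illustration. Each call site from the corpus yields one example labeled correct and, where swapping changes anything, one labeled likely incorrect.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Call:
    """Hypothetical, simplified call site: callee name plus argument names."""
    callee: str
    args: Tuple[str, ...]


def seed_swapped_args(call: Call) -> Optional[Call]:
    """Return a likely-incorrect variant with the two arguments swapped.

    Assumption for this sketch: only two-argument calls are transformed,
    and calls whose arguments have identical names are skipped because
    swapping them would not change the code.
    """
    if len(call.args) != 2 or call.args[0] == call.args[1]:
        return None
    return Call(call.callee, (call.args[1], call.args[0]))


def build_training_set(calls: List[Call]) -> List[Tuple[Call, int]]:
    """Pair each corpus call (label 0) with its seeded variant (label 1)."""
    examples: List[Tuple[Call, int]] = []
    for call in calls:
        examples.append((call, 0))  # code as found in the corpus
        buggy = seed_swapped_args(call)
        if buggy is not None:
            examples.append((buggy, 1))  # artificially seeded bug
    return examples


if __name__ == "__main__":
    corpus = [Call("setTimeout", ("callback", "delay"))]
    for example, label in build_training_set(corpus):
        print(label, example)
```

A binary classifier would then be trained on a vector representation of the names in each example; that step is omitted here because the abstract does not specify it.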