DeepBugs: A learning approach to name-based bug detection

Cited by: 191
Authors
Pradel M. [1]
Sen K. [2, 3]
Affiliations
[1] TU Darmstadt, Department of Computer Science
[2] University of California, Berkeley
Keywords
Bug detection; JavaScript; Machine learning; Name-based program analysis; Natural language
DOI
10.1145/3276517
Abstract
Natural language elements in source code, e.g., the names of variables and functions, convey useful information. However, most existing bug detection tools ignore this information and therefore miss some classes of bugs. The few existing name-based bug detection approaches reason about names on a syntactic level and rely on manually designed and tuned algorithms to detect bugs. This paper presents DeepBugs, a learning approach to name-based bug detection, which reasons about names based on a semantic representation and which automatically learns bug detectors instead of manually writing them. We formulate bug detection as a binary classification problem and train a classifier that distinguishes correct from incorrect code. To address the challenge that effectively learning a bug detector requires examples of both correct and incorrect code, we create likely incorrect code examples from an existing corpus of code through simple code transformations. A novel insight learned from our work is that learning from artificially seeded bugs yields bug detectors that are effective at finding bugs in real-world code. We implement our idea into a framework for learning-based and name-based bug detection. Three bug detectors built on top of the framework detect accidentally swapped function arguments, incorrect binary operators, and incorrect operands in binary operations. Applying the approach to a corpus of 150,000 JavaScript files yields bug detectors that have a high accuracy (between 89% and 95%), are very efficient (less than 20 milliseconds per analyzed file), and reveal 102 programming mistakes (with 68% true positive rate) in real-world code. © 2018 Copyright held by the owner/author(s).
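The bug-seeding step described in the abstract can be illustrated with a minimal sketch. It assumes each call site has already been reduced to a callee name and a tuple of argument names; the `Call` type, the helper names, and the sample corpus are hypothetical and are not taken from the paper. The sketch covers only the swapped-arguments transformation and the labeling of examples, not the semantic representation of names or the classifier itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """Hypothetical, simplified view of a call site: names only."""
    callee: str
    args: tuple


def seed_swapped_argument_bug(call):
    """Return a copy of `call` with its first two arguments swapped."""
    first, second, *rest = call.args
    return Call(call.callee, (second, first, *rest))


def build_training_set(calls):
    """Pair each usable call (label 0) with a seeded variant (label 1)."""
    examples = []
    for call in calls:
        # Skip calls where a swap is impossible or changes nothing.
        if len(call.args) < 2 or call.args[0] == call.args[1]:
            continue
        examples.append((call, 0))  # existing code, assumed correct
        examples.append((seed_swapped_argument_bug(call), 1))  # likely incorrect
    return examples


# Assumed toy corpus; a real corpus would be extracted from source files.
corpus = [
    Call("setTimeout", ("callback", "delay")),
    Call("done", ("err",)),
]
training_set = build_training_set(corpus)
```

The labeled pairs would then be turned into feature vectors and passed to a binary classifier. Labeling existing code as correct is itself an assumption, which is why the abstract calls the seeded examples only "likely incorrect".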