A Practical Black-Box Attack on Source Code Authorship Identification Classifiers

Cited by: 11
Authors
Liu, Qianjun [1 ]
Ji, Shouling [1 ]
Liu, Changchang [2 ]
Wu, Chunming [1 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
[2] IBM Thomas J Watson Res Ctr, Dept Distributed AI, Yorktown Hts, NY 10598 USA
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Tools; Training; Syntactics; Predictive models; Perturbation methods; Transforms; Source code; Authorship identification; Adversarial stylometry; Robustness
DOI
10.1109/TIFS.2021.3080507
Chinese Library Classification
TP301 [Theory, Methods]
Discipline code
081202
Abstract
Recent research has shown that adversarial stylometry of source code can confuse source code authorship identification (SCAI) models, which threatens the security of related applications such as programmer attribution and software forensics. In this work, we propose source code authorship disguise (SCAD) to automatically hide programmers' identities from authorship identification; SCAD is more practical than previous work, which requires knowledge of the output probabilities or internal details of the target SCAI model. Specifically, SCAD trains a substitute model and develops a set of semantically equivalent transformations, based on which the original code is modified towards a disguised style through small manipulations of lexical and syntactic features. When evaluated in a fully black-box setting on a real-world dataset of 1,600 programmers, SCAD induces misclassification rates above 30% in state-of-the-art SCAI models. The efficiency and utility-preserving properties of SCAD are also demonstrated with multiple metrics. Furthermore, our work can serve as a guideline for developing more robust identification methods in the future.
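The abstract's key mechanism is rewriting code through semantically equivalent transformations so that lexical and syntactic style features change while behavior does not. The following Python snippet is a minimal illustrative sketch of one such transformation, not the authors' SCAD implementation (which additionally trains a substitute model to guide which edits to apply): it rewrites a simple `for i in range(stop)` loop into an equivalent `while` loop using the standard `ast` module. The class name, the pattern-matching conditions, and the example function are assumptions chosen for illustration.

# Illustrative sketch of one semantics-preserving, style-changing rewrite;
# NOT the SCAD implementation from the paper. Requires Python 3.9+ (ast.unparse).
import ast

class ForRangeToWhile(ast.NodeTransformer):
    """Rewrite `for <name> in range(<stop>): <body>` into an equivalent while loop."""

    def visit_For(self, node):
        self.generic_visit(node)
        # Handle only the simple pattern with a single range() argument and no
        # else-clause; assumes the loop variable is not read after the loop.
        if (isinstance(node.target, ast.Name)
                and isinstance(node.iter, ast.Call)
                and isinstance(node.iter.func, ast.Name)
                and node.iter.func.id == "range"
                and len(node.iter.args) == 1
                and not node.orelse):
            i = node.target.id
            init = ast.parse(f"{i} = 0").body[0]        # i = 0
            cond = ast.parse(f"{i} < 0").body[0].value  # i < <placeholder>
            cond.comparators = [node.iter.args[0]]      # i < stop
            incr = ast.parse(f"{i} += 1").body[0]       # i += 1
            loop = ast.While(test=cond, body=node.body + [incr], orelse=[])
            return [init, loop]
        return node

src = (
    "def total(xs):\n"
    "    s = 0\n"
    "    for i in range(len(xs)):\n"
    "        s += xs[i]\n"
    "    return s\n"
)
tree = ast.fix_missing_locations(ForRangeToWhile().visit(ast.parse(src)))
print(ast.unparse(tree))  # same behavior, different syntactic style features

A style classifier that counts constructs such as for- versus while-loop usage now extracts different feature values from the rewritten function even though it computes the same result; an approach like SCAD chains many such edits and uses the substitute model's predictions to decide when the disguise succeeds.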
Pages: 3620-3633
Page count: 14