A Practical Black-Box Attack on Source Code Authorship Identification Classifiers

Cited by: 11
Authors
Liu, Qianjun [1 ]
Ji, Shouling [1 ]
Liu, Changchang [2 ]
Wu, Chunming [1 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
[2] IBM Thomas J Watson Res Ctr, Dept Distributed AI, Yorktown Hts, NY 10598 USA
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Tools; Training; Syntactics; Predictive models; Perturbation methods; Transforms; Source code; authorship identification; adversarial stylometry; ROBUSTNESS;
DOI
10.1109/TIFS.2021.3080507
Chinese Library Classification
TP301 [Theory, Methods]
Subject Classification Code
081202
Abstract
Existing research has recently shown that adversarial stylometry of source code can confuse source code authorship identification (SCAI) models, which may threaten the security of related applications such as programmer attribution and software forensics. In this work, we propose source code authorship disguise (SCAD) to automatically hide programmers' identities from authorship identification, which is more practical than previous work that requires knowledge of the output probabilities or internal details of the target SCAI model. Specifically, SCAD trains a substitute model and develops a set of semantically equivalent transformations, based on which the original code is modified towards a disguised style with small manipulations of lexical and syntactic features. When evaluated under fully black-box settings on a real-world dataset of 1,600 programmers, SCAD induces state-of-the-art SCAI models to produce misclassification rates above 30%. The efficiency and utility-preserving properties of SCAD are also demonstrated with multiple metrics. Furthermore, our work can serve as a guideline for developing more robust identification methods in the future.
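To illustrate the kind of "semantically equivalent transformation" the abstract describes, here is a minimal sketch in Python: renaming local identifiers perturbs lexical style features (a signal many SCAI models rely on) while leaving program behavior unchanged. The `RenameLocals` transformer and the sample rename mapping are illustrative assumptions only, not SCAD's actual transformation set.

```python
import ast

class RenameLocals(ast.NodeTransformer):
    """Rename variable names per a given mapping -- a lexical-feature
    perturbation that preserves semantics. (Note: function parameters
    are `ast.arg` nodes and would need separate handling.)"""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        if node.id in self.mapping:
            node.id = self.mapping[node.id]
        return node

src = (
    "def total(values):\n"
    "    acc = 0\n"
    "    for v in values:\n"
    "        acc += v\n"
    "    return acc\n"
)

tree = ast.parse(src)
tree = RenameLocals({"acc": "result_sum", "v": "item"}).visit(tree)
disguised = ast.unparse(tree)  # same behavior, different lexical style
```

A real disguise system would also need syntactic transformations (e.g., loop restructuring) and a substitute model to guide which perturbations to apply, as the paper describes.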
Pages: 3620-3633 (14 pages)