ROWBACK: RObust Watermarking for neural networks using BACKdoors

Cited by: 3
Authors
Chattopadhyay, Nandish [1 ]
Chattopadhyay, Anupam [1 ]
Affiliations
[1] Nanyang Technological University, School of Computer Science and Engineering, Singapore, Singapore
Source
20th IEEE International Conference on Machine Learning and Applications (ICMLA 2021) | 2021
Keywords
watermarking neural networks; robustness; backdooring; adversarial samples;
DOI
10.1109/ICMLA52953.2021.00274
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Claiming ownership of trained neural networks is critical for stakeholders who invest heavily in high-performance neural networks. The entire pipeline carries substantial cost, from data curation to the high-performance computing infrastructure required for neural architecture search and model training. Watermarking neural networks is a potential solution to this problem, but standard techniques suffer from vulnerabilities that attackers have demonstrated. In this paper, we propose a robust watermarking mechanism for neural architectures. Our proposed method, ROWBACK, turns two properties of neural networks, the existence of adversarial examples and the ability to embed backdoors during training, into a scheme that provides strong proofs of ownership. We redesign the Trigger Set for watermarking using adversarial examples of the model to be watermarked, assigning specific labels based on adversarial behaviour. We also mark every layer separately during training, so that removing the watermark requires complete retraining. We have tested ROWBACK against the key properties expected of a reliable watermarking scheme: it achieves accuracy within 1-2% of the original model and a complete 100% match on the Trigger Set during verification, while remaining robust against state-of-the-art watermark removal attacks [1], which succeed only after retraining all layers with at least 60% of the samples for more than 45% of the original training epochs.
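The abstract is concrete enough to sketch the two mechanisms it names: building the Trigger Set from the model's own adversarial examples, and marking every layer separately. The PyTorch sketch below is an illustration under stated assumptions, not the authors' released code: the function names (fgsm, make_trigger_set, embed_layerwise) are hypothetical, FGSM is only one way to generate adversarial triggers, and labelling each trigger with the class the model is adversarially pushed toward is one plausible reading of "labels based on adversarial behaviour".

    # Hedged sketch of the two mechanisms described in the abstract.
    import torch
    import torch.nn.functional as F

    def fgsm(model, x, y, eps=0.03):
        # One-step fast gradient sign perturbation of inputs x with labels y.
        x = x.clone().detach().requires_grad_(True)
        F.cross_entropy(model(x), y).backward()
        return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()

    def make_trigger_set(model, loader, eps=0.03, size=100):
        # Collect adversarial examples of the model itself and label each one
        # with the class the model is pushed toward (assumed labelling rule).
        model.eval()
        xs, ys = [], []
        for x, y in loader:
            x_adv = fgsm(model, x, y, eps)
            with torch.no_grad():
                y_adv = model(x_adv).argmax(dim=1)
            keep = y_adv != y                     # successful adversarials only
            xs.append(x_adv[keep])
            ys.append(y_adv[keep])
            if sum(len(t) for t in xs) >= size:
                break
        return torch.cat(xs)[:size], torch.cat(ys)[:size]

    def embed_layerwise(model, x_t, y_t, lr=1e-3, steps=50):
        # Mark every layer separately: unfreeze one parameter tensor at a time
        # and fit the Trigger Set on it, so the watermark is spread across all
        # layers and cannot be removed by retraining only a few of them.
        model.train()
        params = list(model.parameters())
        for marked in params:
            for p in params:
                p.requires_grad_(p is marked)
            opt = torch.optim.SGD([marked], lr=lr)
            for _ in range(steps):
                opt.zero_grad()
                F.cross_entropy(model(x_t), y_t).backward()
                opt.step()
        for p in params:                          # restore normal training mode
            p.requires_grad_(True)

Verification would then amount to checking that a suspect model reproduces the Trigger Set labels exactly, consistent with the 100% Trigger Set match the abstract reports; the layer-wise embedding loop is what makes partial retraining insufficient for watermark removal.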
Pages: 1728 - 1735
Page count: 8
Related Papers
31 records in total
  • [1] Adi Y., 2018, Proceedings of the 27th USENIX Security Symposium, P1615
  • [2] Al-Rfou, 2018, arXiv preprint
  • [3] [Anonymous], 2018, Adversarial attacks and defences: A survey
  • [4] [Anonymous], DeepSigns: A Generic Watermarking Framework for IP Protection of Deep Learning Models
  • [5] Chattopadhyay N., 2020, Security, Privacy, and Applied Cryptography Engineering: 10th International Conference, SPACE 2020, Proceedings, LNCS 12586, P46, DOI 10.1007/978-3-030-66626-2_3
  • [6] Chen H. L., 2019, Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI), P4658
  • [7] Chen X., 2019, arXiv preprint
  • [8] Deng J., 2009, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), P248, DOI 10.1109/CVPRW.2009.5206848
  • [9] Goodfellow I., McDaniel P., Papernot N., 2018, Making Machine Learning Robust Against Adversarial Inputs, Communications of the ACM, 61(7): 56-66
  • [10] Goodfellow I., 2016, Adaptive Computation and Machine Learning series, P1