Membership Inference Attacks on Sequence-to-Sequence Models: Is My Data In Your Machine Translation System?

Cited by: 39
Authors
Hisamoto, Sorami [1]
Post, Matt [2 ]
Duh, Kevin [2 ]
Affiliations
[1] Works Applicat, Tokyo, Japan
[2] Johns Hopkins Univ, Baltimore, MD USA
Keywords
Computational linguistics - Computer aided language translation - Data privacy;
DOI
10.1162/tacl_a_00299
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data privacy is an important issue for "machine learning as a service" providers. We focus on the problem of membership inference attacks: Given a data sample and black-box access to a model's API, determine whether the sample existed in the model's training data. Our contribution is an investigation of this problem in the context of sequence-to-sequence models, which are important in applications such as machine translation and video captioning. We define the membership inference problem for sequence generation, provide an open dataset based on state-of-the-art machine translation models, and report initial results on whether these models leak private information against several kinds of membership inference attacks.
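The black-box setting described in the abstract can be illustrated with a minimal sketch. It is not the attack evaluated in the paper: it assumes a hypothetical `translate` callable standing in for the model's API, a simple token-overlap score (`unigram_f1`) chosen here for illustration, and an arbitrary decision threshold. The attacker sends the source sentence to the API and guesses "member" when the output closely matches the candidate target sentence.

```python
from typing import Callable


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Token-overlap F1 between two whitespace-tokenized strings."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp or not ref:
        return 0.0
    # Count reference tokens so repeated words are matched at most once each.
    ref_counts = {}
    for tok in ref:
        ref_counts[tok] = ref_counts.get(tok, 0) + 1
    overlap = 0
    for tok in hyp:
        if ref_counts.get(tok, 0) > 0:
            overlap += 1
            ref_counts[tok] -= 1
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def guess_membership(
    translate: Callable[[str], str],  # hypothetical black-box API
    source: str,
    target: str,
    threshold: float = 0.9,  # assumed value, not taken from the paper
) -> bool:
    """Guess 'in training data' if the API output is close to the target."""
    return unigram_f1(translate(source), target) >= threshold


# Toy stand-in for a model that has memorized one training pair.
_MEMORIZED = {"guten morgen": "good morning"}


def toy_translate(source: str) -> str:
    return _MEMORIZED.get(source, "unknown")
```

With the toy model, `guess_membership(toy_translate, "guten morgen", "good morning")` flags the memorized pair as a member, while an unseen pair such as `("gute nacht", "good night")` is not flagged. A real system would not separate members from non-members this cleanly, which is the question the paper investigates.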
Pages: 49-63
Page count: 15