Membership Inference Attacks on Sequence-to-Sequence Models: Is My Data In Your Machine Translation System?

Cited by: 47

Authors:

Hisamoto, Sorami [1]

Post, Matt [2]

Duh, Kevin [2]

Affiliations:

[1] Works Applications, Tokyo, Japan

[2] Johns Hopkins University, Baltimore, MD, USA

Source:

TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS | 2020 / Volume 8 / Issue 08

Keywords:

Computational linguistics - Computer aided language translation - Data privacy;

DOI:

10.1162/tacl_a_00299

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Data privacy is an important issue for "machine learning as a service" providers. We focus on the problem of membership inference attacks: Given a data sample and black-box access to a model's API, determine whether the sample existed in the model's training data. Our contribution is an investigation of this problem in the context of sequence-to-sequence models, which are important in applications such as machine translation and video captioning. We define the membership inference problem for sequence generation, provide an open dataset based on state-of-the-art machine translation models, and report initial results on whether these models leak private information against several kinds of membership inference attacks.

Pages: 49-63

Page count: 15

