Enhancing Transparency and Control When Drawing Data-Driven Inferences About Individuals

Cited by: 22
Authors
Chen, Daizhuo [1]
Fraiberger, Samuel P. [2]
Moakler, Robert [3]
Provost, Foster [3]
Affiliations
[1] Columbia Business Sch, New York, NY USA
[2] Northeastern Univ, Network Sci Inst, Boston, MA 02115 USA
[3] NYU, Stern Sch Business, 44 West Fourth St,8th Floor, New York, NY 10012 USA
Keywords
predictive modeling; transparency; privacy; comprehensibility; inference; control; big data; classifications
DOI
10.1089/big.2017.0074
Chinese Library Classification (CLC)
TP39 [Computer Applications]
Discipline Codes
081203; 0835
Abstract
Recent studies show the remarkable power of fine-grained information disclosed by users on social network sites to infer users' personal characteristics via predictive modeling. Similar fine-grained data are being used successfully in other commercial applications. In response, attention is turning increasingly to the transparency that organizations provide to users as to what inferences are drawn and why, as well as to what sort of control users can be given over inferences that are drawn about them. In this article, we focus on inferences about personal characteristics based on information disclosed by users' online actions. As a use case, we explore personal inferences that are made possible from Likes on Facebook. We first present a means for providing transparency into the information responsible for inferences drawn by data-driven models. We then introduce the cloaking device, a mechanism for users to inhibit the use of particular pieces of information in inference. Using these analytical tools, we ask two main questions: (1) How much information must users cloak to significantly affect inferences about their personal traits? We find that users usually must cloak only a small portion of their actions to inhibit inference. We also find that, encouragingly, false-positive inferences are significantly easier to cloak than true-positive inferences. (2) Can firms change their modeling behavior to make cloaking more difficult? The answer is a definitive yes. We demonstrate a simple modeling change that requires users to cloak substantially more information to affect the inferences drawn. The upshot is that organizations can provide transparency and control even into complicated, predictive model-driven inferences, but they also can make control easier or harder for their users.
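The abstract describes cloaking as withholding particular pieces of information (e.g., individual Likes) so that a data-driven model can no longer draw a given inference about a trait. It does not spell out the mechanics, but as a minimal sketch of the idea, assuming a logistic-regression-style model over binary Like features (our assumption, not a detail given here), cloaking can be framed as greedily removing the strongest positive evidence until the predicted probability falls below the decision threshold. All weights, Like names, and the greedy_cloak helper below are illustrative, not the authors' implementation.

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def greedy_cloak(weights, bias, active_features, threshold=0.5):
    # Greedily cloak (remove) the active features that contribute the most
    # positive evidence, stopping once the model's predicted probability
    # for the trait falls below the decision threshold.
    active = set(active_features)
    cloaked = []
    for f in sorted(active, key=lambda g: weights.get(g, 0.0), reverse=True):
        score = bias + sum(weights.get(g, 0.0) for g in active)
        if sigmoid(score) < threshold:
            break  # inference already inhibited
        if weights.get(f, 0.0) <= 0.0:
            break  # cloaking non-positive evidence cannot lower the score
        active.remove(f)
        cloaked.append(f)
    return cloaked

# Toy example: a hypothetical trait model over four Likes.
weights = {"like_A": 2.0, "like_B": 1.5, "like_C": 0.8, "like_D": 0.1}
bias = -2.5
user_likes = ["like_A", "like_B", "like_C", "like_D"]
print(greedy_cloak(weights, bias, user_likes))  # -> ['like_A']

Under this framing, the paper's two questions map naturally onto (1) how many features greedy_cloak must remove, and (2) how a modeling change affects that number; for instance, a model whose evidence is spread across many weakly weighted features would force more removals than one dominated by a few strong Likes.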
Pages: 197-212
Page count: 16