Enhancing Transparency and Control When Drawing Data-Driven Inferences About Individuals

Cited by: 22
Authors
Chen, Daizhuo [1]
Fraiberger, Samuel P. [2]
Moakler, Robert [3]
Provost, Foster [3]
Affiliations
[1] Columbia Business Sch, New York, NY USA
[2] Northeastern Univ, Network Sci Inst, Boston, MA 02115 USA
[3] NYU, Stern Sch Business, 44 West Fourth St, 8th Floor, New York, NY 10012 USA
Keywords
predictive modeling; transparency; privacy; comprehensibility; inference; control; big data; classifications
DOI
10.1089/big.2017.0074
Chinese Library Classification (CLC)
TP39 [Computer Applications]
Discipline classification codes
081203; 0835
Abstract
Recent studies show the remarkable power of fine-grained information disclosed by users on social network sites to infer users' personal characteristics via predictive modeling. Similar fine-grained data are being used successfully in other commercial applications. In response, attention is turning increasingly to the transparency that organizations provide to users as to what inferences are drawn and why, as well as to what sort of control users can be given over inferences that are drawn about them. In this article, we focus on inferences about personal characteristics based on information disclosed by users' online actions. As a use case, we explore personal inferences that are made possible from Likes on Facebook. We first present a means for providing transparency into the information responsible for inferences drawn by data-driven models. We then introduce the cloaking device, a mechanism for users to inhibit the use of particular pieces of information in inference. Using these analytical tools, we ask two main questions: (1) How much information must users cloak to significantly affect inferences about their personal traits? We find that users usually must cloak only a small portion of their actions to inhibit inference. We also find that, encouragingly, false-positive inferences are significantly easier to cloak than true-positive inferences. (2) Can firms change their modeling behavior to make cloaking more difficult? The answer is a definitive yes. We demonstrate a simple modeling change that requires users to cloak substantially more information to affect the inferences drawn. The upshot is that organizations can provide transparency and control even into complicated, predictive model-driven inferences, but they also can make control easier or harder for their users.
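To make the cloaking idea concrete, below is a minimal sketch in Python. It assumes a linear model (e.g., logistic regression) over binary Like indicators and greedily hides the Likes that contribute the most positive evidence until the predicted probability drops below the decision threshold. The function cloak, the greedy strategy, and the toy weights are illustrative assumptions, not the authors' exact implementation.

import math

def cloak(likes, weights, bias, threshold):
    """Greedily remove ('cloak') the Likes that carry the most positive
    evidence in an assumed linear model, until the predicted probability
    of the target trait falls below the decision threshold.

    likes     -- set of Like IDs the user has disclosed (hypothetical)
    weights   -- dict mapping Like ID to model coefficient (hypothetical)
    bias      -- model intercept
    threshold -- probability above which the trait would be inferred
    Returns the set of Likes the user would need to cloak.
    """
    def prob(active):
        score = bias + sum(weights.get(l, 0.0) for l in active)
        return 1.0 / (1.0 + math.exp(-score))  # logistic link

    active, cloaked = set(likes), set()
    # Consider only Likes that push the score upward, strongest first.
    for like in sorted((l for l in active if weights.get(l, 0.0) > 0),
                       key=lambda l: weights[l], reverse=True):
        if prob(active) < threshold:
            break  # inference already inhibited
        active.remove(like)
        cloaked.add(like)
    return cloaked

# Hypothetical example: one Like dominates the model's evidence.
weights = {101: 2.0, 102: 0.4, 103: -0.3}
print(cloak({101, 102, 103}, weights, bias=-1.0, threshold=0.5))  # -> {101}

In this toy model, hiding a single strongly predictive Like flips the inference, which mirrors the finding that cloaking a small portion of one's actions often suffices. A modeling change of the kind the abstract alludes to could, for instance, spread evidence across many weakly predictive features so that no small set of removals changes the prediction; that particular mechanism is an assumption here, not a claim about the paper's method.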
Pages: 197-212
Number of pages: 16