Secure and Scalable Statistical Computation of Questionnaire Data in R

Times cited: 6
Authors
Yigzaw, Kassaye Yitbarek [1, 2]
Michalas, Antonis [3]
Bellika, Johan Gustav [2, 4]
Affiliations
[1] UiT Arctic Univ Norway, Dept Comp Sci, N-9037 Tromso, Norway
[2] Univ Hosp North Norway, Norwegian Ctr E Hlth Res, N-9019 Tromso, Norway
[3] Univ Westminster, Dept Comp Sci, London W1W 6UW, England
[4] UiT Arctic Univ Norway, Dept Clin Med, N-9037 Tromso, Norway
Source
IEEE Access | 2016 / Vol. 4
Keywords
Bloom filter; privacy; questionnaire; statistical analysis; secure multi-party computation; secret sharing; PERFORMANCE;
DOI
10.1109/ACCESS.2016.2599851
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Collecting data via a questionnaire and analyzing the data while preserving respondents' privacy may increase the number of respondents and the truthfulness of their responses. It may also reduce the systematic differences between respondents and non-respondents. In this paper, we propose a privacy-preserving method for collecting and analyzing survey responses using secure multi-party computation. The method is secure under the semi-honest adversarial model. The proposed method computes a wide variety of statistics. Total and stratified statistical counts are computed using the secure protocols developed in this paper. Then, additional statistics, such as a contingency table, a chi-square test, an odds ratio, and logistic regression, are computed within the R statistical environment using the statistical counts as building blocks. The method was evaluated on a questionnaire data set of 3158 respondents sampled for a medical study and on simulated questionnaire data sets of up to 50,000 respondents. The computation time for the statistical analyses scales linearly with the number of respondents. The results show that the method is efficient and scalable for practical use. It can also be used for other applications in which categorical data are collected.
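The two-stage idea in the abstract (secure counts first, then ordinary statistics built from those counts) can be illustrated with a minimal Python sketch. This is not the paper's protocol or its R implementation: plain additive secret sharing over an assumed modulus stands in for the secure counting step, and the names `share`, `secure_counts`, `odds_ratio`, and `chi_square` are hypothetical helpers introduced only for this illustration.

```python
import secrets

# Assumed modulus for the additive shares (an illustrative choice, not from the paper).
MODULUS = 2**61 - 1


def share(value, n_parties=3):
    """Split an integer into n_parties additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_counts(responses, n_parties=3):
    """Return the 2x2 cell counts [a, b, c, d] for (exposed, outcome) responses.

    Each respondent one-hot encodes the cell they fall into and secret-shares
    every indicator; each party only ever sums the shares it receives.
    Cell order: a = exposed & outcome, b = exposed & no outcome,
                c = unexposed & outcome, d = unexposed & no outcome.
    """
    party_totals = [[0] * 4 for _ in range(n_parties)]
    for exposed, outcome in responses:
        cell = 2 * (1 - exposed) + (1 - outcome)
        for k in range(4):
            indicator = 1 if k == cell else 0
            for p, s in enumerate(share(indicator, n_parties)):
                party_totals[p][k] = (party_totals[p][k] + s) % MODULUS
    # Only the aggregated per-cell totals are reconstructed, never single answers.
    return [sum(t[k] for t in party_totals) % MODULUS for k in range(4)]


def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table (assumes b and c are non-zero)."""
    return (a * d) / (b * c)


def chi_square(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table, without continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Made-up example responses as (exposed, outcome) pairs, not data from the study.
responses = [(1, 1)] * 20 + [(1, 0)] * 10 + [(0, 1)] * 5 + [(0, 0)] * 15
a, b, c, d = secure_counts(responses)
print(a, b, c, d, odds_ratio(a, b, c, d), chi_square(a, b, c, d))
```

In this sketch the statistics functions never see individual responses, only the reconstructed cell counts, which mirrors the abstract's description of counts serving as building blocks for further analysis.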
Pages: 4635-4645
Number of pages: 11