Towards efficient and secure analysis of large datasets

Cited by: 0

Authors:

Cimato, Stelvio [1]

Nicolo, Stefano [1]

Affiliations:

[1] Univ Milan, Dipartimento Informat, Milan, Italy

Source:

2020 IEEE 44TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2020) | 2020

Keywords:

machine learning; privacy preserving techniques; secure multi-party computation

DOI:

10.1109/COMPSAC48688.2020.00-68

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Subject classification codes:

081203; 0835

Abstract:

One of the promises of the "big data" revolution is that, through the analysis of large datasets, people will benefit from solutions to many different problems obtained by deploying advanced machine learning models. One challenge of this standard approach is that information must be centralized in the data center or on the machine where the training phase is performed, which raises many privacy concerns. In this paper we take a step towards secure and efficient processing of large distributed datasets, where the original data reside at different locations and are processed in a privacy-preserving way. In particular, we rely on available technologies to securely design a machine learning model by performing the training phase on encrypted data. Our case study focuses on forecasting the energy production of wind farms situated in different locations. We show in detail how the machine learning model is created from the available datasets, compare its results with those produced by previous models, and discuss their performance.


Pages: 1351-1356

Number of pages: 6
