Framework for the Assessment of Data Masking Performance Penalties in SQL Database Servers. Case Study: Oracle

Cited by: 4
Authors
Fotache, Marin [1]
Munteanu, Adrian [1, 2]
Strimbei, Catalin [1]
Hrubaru, Ionut [1, 3]
Affiliations
[1] Alexandru Ioan Cuza University, Business Information Systems Department, Iasi 700506, Romania
[2] KTB InfoNet, Iasi 700506, Romania
[3] Optymyze, Iasi 700506, Romania
Keywords
Data privacy; Databases; Identification of persons; Information integrity; Information filtering; Servers; Machine learning; Query processing; Masking threshold; Data masking; Privacy; Query execution time; Security; SQL; Regression
DOI
10.1109/ACCESS.2023.3247486
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Dynamic data masking (DDM) is a powerful data-security technique for protecting personal and other sensitive information in databases from unauthorized access. DDM masks or obfuscates information in real time, as it is accessed by unauthorized users. This prevents sensitive information from being exposed while still allowing authorized users to access the data. In current multilayered applications, data masking may be incorporated as special modules placed anywhere between the storage and the user interface. In this paper, we consider implementing masking directly in the persistence layer so that data do not travel unmasked along the network. The data at rest are unchanged (i.e., unmasked), but when users query the database, the sensitive columns in the results are displayed in a masked format, which makes it impossible to identify the original data. Given the diversity of masking features offered by commercial and open-source database servers, this study proposes a framework for assessing the performance penalty of SQL queries when using database masking relative to the original (unmasked) scenario. We implemented and applied the framework to a basic masking scenario in the Oracle database server using the TPC-H benchmark database. Exploratory analysis and machine learning models suggest that DDM has a weak impact on query performance. This could be a powerful incentive for incorporating DDM in real-world software applications when up to 100 GB of data are stored in an Oracle database server.
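The idea described in the abstract (data at rest unchanged, sensitive columns masked at query time, and execution time compared against the unmasked scenario) can be illustrated with a minimal sketch. This is not the authors' framework and does not use Oracle's masking features: it assumes SQLite as a stand-in database, a view as a stand-in masking mechanism, and synthetic rows that borrow only the TPC-H column names `c_custkey`, `c_name`, and `c_phone`. The helper names `build_demo_db` and `timed_query` are hypothetical.

```python
import sqlite3
import time


def build_demo_db(n_rows=1000):
    """Create an in-memory table with synthetic rows and a masking view.

    Assumption: a view stands in for server-side dynamic masking.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer ("
        "c_custkey INTEGER PRIMARY KEY, c_name TEXT, c_phone TEXT)"
    )
    rows = [
        (i, f"Customer#{i:09d}", f"25-989-741-{i % 10000:04d}")
        for i in range(1, n_rows + 1)
    ]
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    # The stored values are never modified; only the query result is masked.
    conn.execute(
        "CREATE VIEW customer_masked AS "
        "SELECT c_custkey, c_name, "
        "'XX-XXX-XXX-' || substr(c_phone, -4) AS c_phone "
        "FROM customer"
    )
    return conn


def timed_query(conn, sql):
    """Run a query and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return rows, time.perf_counter() - start


conn = build_demo_db()
plain, t_plain = timed_query(
    conn, "SELECT c_phone FROM customer ORDER BY c_custkey"
)
masked, t_masked = timed_query(
    conn, "SELECT c_phone FROM customer_masked ORDER BY c_custkey"
)

# Relative execution time of the masked query versus the unmasked one.
penalty = t_masked / t_plain if t_plain > 0 else float("nan")
print(plain[0][0], "->", masked[0][0])
print(f"relative execution time (masked / unmasked): {penalty:.2f}")
```

A single timing like this is noisy; a realistic assessment would repeat each query many times across different scale factors and compare the resulting distributions.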
Pages: 18520-18541
Number of pages: 22