Anonymization of Network Traces Data through Condensation-based Differential Privacy

Cited by: 4
Authors
Aleroud, Ahmed [1, 2, 3, 5]
Yang, Fan [2, 4]
Pallaprolu, Sai Chaithanya [2, 4]
Chen, Zhiyuan [2, 4]
Karabatis, George [2, 4]
Affiliations
[1] Yarmouk Univ, Irbid, Jordan
[2] Univ Maryland Baltimore Cty, Baltimore, MD 21228 USA
[3] Augusta Univ, Sch Comp & Cyber Sci, 2500 Walton Way, Augusta, GA 30904 USA
[4] Univ Maryland, Dept Informat Syst, Baltimore, MD 21250 USA
[5] Augusta Univ, Augusta, GA 30912 USA
Source
DIGITAL THREATS: RESEARCH AND PRACTICE | 2021 / Vol. 2 / Issue 04
Keywords
Data Injection attacks; information security; netflow; intrusion detection; semantic link network; differential privacy; trace anonymization; K-ANONYMITY
DOI
10.1145/3425401
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Network traces are a primary source of information for researchers, who use them to study problems such as identifying user behavior, analyzing network hierarchy, maintaining network security, and classifying packet flows. However, most organizations are reluctant to share their data with third parties or the public because of privacy concerns. Anonymizing data before sharing is therefore a convenient solution for both organizations and researchers. Although several anonymization algorithms are available, few provide sufficient privacy (the organization's need), acceptable data utility (the researcher's need), and efficient data analysis at the same time. This article introduces a condensation-based differential privacy anonymization approach that achieves a better tradeoff between privacy and utility than existing techniques and produces anonymized network trace data that can be shared publicly without lowering its utility. Our solution also incurs no extra computation overhead for the data analyzer. A prototype system has been implemented, and experiments show that the proposed approach preserves privacy and allows data analysis without revealing the original data, even when injection attacks are launched against it. When anonymized datasets are given as input to graph-based intrusion detection techniques, they yield intrusion detection rates almost identical to those of the original datasets, with only a negligible impact.
Pages: 23