Efficient approximation and privacy preservation algorithms for real time online evolving data streams

Cited by: 4
Authors
Patil, Rahul A. [1, 2]
Patil, Pramod D. [1]
Affiliations
[1] Dr D Y Patil Inst Technol, Pimpri Pune 411018, Maharashtra, India
[2] Pimpri Chinchwad Coll Engn, Pune 411044, Maharashtra, India
Source
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2024 / Vol. 27 / Issue 01
Keywords
Approximation; Data streaming; Clustering; k-anonymization; l-diversity; Privacy preservation; ANONYMIZATION
DOI
10.1007/s11280-024-01244-9
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Mining real-time streaming data is a more challenging research problem than mining static data because it requires processing continuous, unstructured, large data streams. Privacy concerns arise when the streaming data includes sensitive information. In recent years, research on the anonymization of static data has progressed significantly; two typical strategies for anonymizing quasi-identifiers are generalization and suppression. However, the highly dynamic and potentially infinite nature of streaming data makes anonymization a challenging task. To this end, we propose a novel Efficient Approximation and Privacy Preservation Algorithms (EAPPA) framework that achieves efficient pre-processing of live streaming data and preserves its privacy with minimal Information Loss (IL) and computational requirements. Because existing privacy preservation solutions for streaming data suffer from redundant data, we first propose an efficient data approximation technique with data pre-processing. We design the Flajolet-Martin (FM) algorithm for robust and efficient approximation of unique elements in the data stream, together with a data cleaning mechanism. We feed the periodically approximated and pre-processed streaming data to the anonymization algorithm. Using adaptive clustering, we propose innovative k-anonymization and l-diversity privacy principles for data streams. The proposed approach scans a stream to detect and reuse clusters that fulfill the k-anonymity and l-diversity criteria, reducing anonymization time and IL. The experimental results demonstrate the efficiency of the EAPPA framework compared to state-of-the-art methods.
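To illustrate the distinct-element approximation step mentioned in the abstract, the following is a minimal sketch of a textbook Flajolet-Martin estimator. It is not the EAPPA implementation: the SHA-256-based hash, the number of hash functions, and all function names are assumptions made for this example only.

```python
import hashlib

PHI = 0.77351  # correction constant from the Flajolet-Martin analysis


def trailing_zeros(x: int, width: int = 32) -> int:
    """Count trailing zero bits of x; return `width` when x is 0."""
    if x == 0:
        return width
    return (x & -x).bit_length() - 1


def hash32(item: str, seed: int) -> int:
    """Seeded 32-bit hash built from SHA-256 (an illustrative choice)."""
    digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def fm_estimate(stream, num_hashes: int = 64) -> float:
    """Estimate the number of distinct items in `stream` in one pass."""
    bitmaps = [0] * num_hashes  # one small bitmap per hash function
    for item in stream:
        key = str(item)
        for seed in range(num_hashes):
            # Record the position of the lowest set bit of the hash value.
            bitmaps[seed] |= 1 << trailing_zeros(hash32(key, seed))
    if not any(bitmaps):
        return 0.0  # empty stream: nothing was recorded
    total_r = 0
    for bitmap in bitmaps:
        r = 0
        while (bitmap >> r) & 1:  # index of the lowest unset bit
            r += 1
        total_r += r
    # Average R over all hash functions, then apply the correction.
    return (2 ** (total_r / num_hashes)) / PHI
```

Because each bitmap only records which bit positions have been seen, repeated items leave the estimate unchanged, which is why a sketch of this kind suits streams with redundant records. The result is approximate; averaging over more hash functions reduces its variance at the cost of extra hashing per item.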
Pages: 20