Victimization (V) of Big Data: A Solution Using Federated Learning

Cited by: 0
Authors
Shivkumar, S. [1 ,2 ]
Supriya, M. [1 ,2 ]
Affiliations
[1] Amrita Vishwa Vidyapeetham, Dept Comp Sci Engn, Bengaluru 560036, Karnataka, India
[2] Amrita Vishwa Vidyapeetham, Amrita Sch Comp, Bengaluru 560036, Karnataka, India
Source
SMART TRENDS IN COMPUTING AND COMMUNICATIONS, VOL 1, SMARTCOM 2024 | 2024 / Vol. 945
Keywords
Victimization; Big data; Federated learning; Apache Spark; HEALTH-CARE; DATA ANALYTICS; DATA-SECURITY; PRIVACY;
DOI
10.1007/978-981-97-1320-2_15
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The rapid emergence of big data has revolutionized the way organizations perceive and utilize information. With its unparalleled ability to process vast volumes of data at high speeds and handle diverse data types, big data is reshaping industries and enabling evidence-based decision-making. However, the proliferation of big data presents significant privacy challenges. The extensive collection, aggregation, and analysis of diverse datasets can inadvertently expose sensitive personal information, leading to potential breaches of individual privacy. In this work, a new "V" called victimization is introduced as a characteristic of big data. This issue can lead to hazardous consequences. To address the vulnerabilities due to this characteristic, a federated learning approach is proposed as a solution. The proposed approach was tested on two datasets in the domain of health care. The model was also trained using the conventional deep learning approach and PySpark. The findings in our research suggest that the federated learning approach helps in overcoming those issues leading to victimization without compromising the performance of the model.
Pages: 171-182
Page count: 12
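
The abstract names federated learning as the remedy but this record does not describe the training procedure. As a rough illustration only, the sketch below shows federated averaging (FedAvg) on a toy one-dimensional linear model: each client trains on its own private data, and only model weights are sent for aggregation. The choice of FedAvg, the model, the function names, and the hyperparameters are all assumptions made for illustration; none of them are taken from the paper.

```python
# Hypothetical FedAvg sketch; NOT the paper's implementation.
# Raw data never leaves a client: only [slope, intercept] weights are shared.
from typing import List, Tuple

Weights = List[float]              # [slope, intercept] of a 1-D linear model
Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one client


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.05, epochs: int = 20) -> Weights:
    """Run gradient descent on one client's data, starting from the global weights."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error for y_hat = w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Average client weights, weighting each client by its number of samples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def train_federated(clients: List[Dataset], rounds: int = 50) -> Weights:
    """Alternate local training on each client with server-side averaging."""
    global_weights: Weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


if __name__ == "__main__":
    # Two clients whose private data both follow y = 2x + 1.
    clients = [
        [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)],
        [(1.5, 4.0), (2.0, 5.0)],
    ]
    print(train_federated(clients))
```

In this toy setup the server only ever sees weight vectors, which is the property the abstract relies on when it argues that federated learning limits exposure of sensitive records. A real deployment would typically add secure aggregation or differential privacy on top, which this sketch omits.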