Recognize corrupted data packeted while transferring data through ensemble machine learning techniques

Cited by: 1
Authors
Sharma, Satyajeet [1]
Sharma, Bhavna [1]
Affiliations
[1] JECRC Univ, Dept Comp Sci & Engn, Jaipur, Rajasthan, India
Keywords
Corrupt file detection; File transfer protocols; Ensemble machine learning; Data integrity; Error detection; Machine learning; AdaBoost classifiers
DOI
10.47974/JIOS-1420
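Chinese Library Classification
G25 [Library science, librarianship]; G35 [Information science, information work]
Discipline classification code
1205; 120501
Abstract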
As more technologies move toward cloud storage, file transfer protocols have become a cornerstone of any platform that needs to run smoothly. Identifying damaged files is therefore a crucial task in data management and integrity. In this study, we propose an AdaBoost-based machine learning technique for identifying damaged files. AdaBoost is an ensemble method that combines many weak classifiers into one strong classifier. In our method, we train weak classifiers called decision stumps on a dataset that includes both damaged and healthy files. The final prediction is decided by a weighted majority vote of all the weak classifiers. We evaluated our method on a dataset built from file metadata. We compared AdaBoost, as the base algorithm, with more established techniques: Naive Bayes, Logistic Regression, and Linear Discriminant Analysis. The results show that AdaBoost is effective at detecting corrupted files and performs better than these traditional methods. Additionally, our method is computationally efficient and can be easily integrated into existing data management systems. It is expected to have a positive impact on data integrity and management in fields such as digital forensics, cloud computing, and storage systems.
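The abstract describes decision stumps combined by a weighted majority vote. The following is a minimal from-scratch sketch of that general AdaBoost scheme, not the authors' implementation. The three binary metadata flags (size mismatch, checksum mismatch, invalid header) and the labeling rule (a file counts as corrupted when at least two flags are raised) are hypothetical and chosen only for illustration; the record does not list the features actually used.

```python
import math
from itertools import product


def stump_predict(row, feature, threshold, polarity):
    """Decision stump: vote `polarity` if the feature exceeds the threshold."""
    return polarity if row[feature] > threshold else -polarity


def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
        for t in thresholds:
            for polarity in (1, -1):
                err = sum(
                    wi
                    for wi, row, yi in zip(w, X, y)
                    if stump_predict(row, j, t, polarity) != yi
                )
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best


def adaboost_fit(X, y, n_rounds=3):
    """Train an ensemble of (alpha, feature, threshold, polarity) stumps."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        err, j, t, polarity = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # the stump's vote weight
        ensemble.append((alpha, j, t, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(row, j, t, polarity))
            for wi, row, yi in zip(w, X, y)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    """Weighted majority vote: +1 means corrupted, -1 means healthy."""
    score = sum(a * stump_predict(row, j, t, p) for a, j, t, p in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical toy data: every combination of three binary metadata flags
# (size_mismatch, checksum_mismatch, header_invalid).
X = [list(flags) for flags in product([0, 1], repeat=3)]
# Hypothetical labeling rule: corrupted (+1) when at least two flags are raised.
y = [1 if sum(row) >= 2 else -1 for row in X]

model = adaboost_fit(X, y, n_rounds=3)
accuracy = sum(adaboost_predict(model, row) == yi for row, yi in zip(X, y)) / len(X)
print(f"training accuracy: {accuracy:.2f}")
```

On this toy set no single stump classifies more than six of the eight rows correctly, while three boosting rounds classify all eight. This illustrates how weighted voting can turn weak rules into a stronger one; it says nothing about the accuracy reported in the paper.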
Pages: 1459-1469
Page count: 11