An overview of recent distributed algorithms for learning fuzzy models in Big Data classification

Cited by: 15
Authors
Ducange, Pietro [1 ]
Fazzolari, Michela [2 ]
Marcelloni, Francesco [1 ]
Affiliations
[1] Dipartimento Ingn Informaz, Largo Lucio Lazzarino 1, I-56122 Pisa, Italy
[2] CNR, Ist Informat & Telemat, Via Giuseppe Moruzzi 1, I-56124 Pisa, Italy
Keywords
Big Data; Fuzzy models; Data mining; Classification algorithms; Distributed computing; MULTIOBJECTIVE EVOLUTIONARY APPROACH; ASSOCIATIVE CLASSIFICATION; CLUSTERING-ALGORITHM; SYSTEMS; MAPREDUCE; ANALYTICS; DESIGN; GRANULARITY; CLASSIFIERS; SELECTION
DOI
10.1186/s40537-020-00298-6
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Nowadays, huge amounts of data are generated, often in very short time intervals and in various formats, by many heterogeneous sources such as social networks and media, mobile devices, internet transactions, networked devices and sensors. These data, identified as Big Data in the literature, are characterized by the popular Vs, such as Value, Veracity, Variety, Velocity and Volume. In particular, Value concerns the useful knowledge that may be mined from data. Thus, in recent years, a number of data mining and machine learning algorithms have been proposed to extract knowledge from Big Data. These algorithms have generally been implemented using ad-hoc programming paradigms, such as MapReduce, on specific distributed computing frameworks, such as Apache Hadoop and Apache Spark. In the context of Big Data, fuzzy models currently play a significant role, thanks to their capability of handling vague and imprecise data and their inherent interpretability. In this work, we give an overview of the most recent distributed learning algorithms for generating fuzzy classification models for Big Data. In particular, we first show some design and implementation details of these learning algorithms. Thereafter, we compare them in terms of accuracy and interpretability. Finally, we discuss their scalability.
Pages: 29
Related papers
50 in total
  • [21] CLASSIFICATION ALGORITHMS FOR BIG DATA ANALYSIS, A MAP REDUCE APPROACH
    Ayma, V. A.
    Ferreira, R. S.
    Happ, P.
    Oliveira, D.
    Feitosa, R.
    Costa, G.
    Plaza, A.
    Gamba, P.
    PIA15+HRIGI15 - JOINT ISPRS CONFERENCE, VOL. I, 2015, 40-3 (W2) : 17 - 21
  • [22] A SURVEY OF MACHINE LEARNING ALGORITHMS FOR BIG DATA ANALYTICS
    Athmaja, S.
    Hanumanthappa, M.
    Kavitha, Vasantha
    2017 INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION, EMBEDDED AND COMMUNICATION SYSTEMS (ICIIECS), 2017
  • [23] A scalable and distributed dendritic cell algorithm for big data classification
    Dagdia, Zaineb Chelly
    SWARM AND EVOLUTIONARY COMPUTATION, 2019, 50
  • [24] Analysis of Bayesian optimization algorithms for big data classification based on Map Reduce framework
    Banchhor, Chitrakant
    Srinivasu, N.
    JOURNAL OF BIG DATA, 2021, 8 (01)
  • [25] An Integration of Extreme Learning Machine for Classification of Big Data
    Zhou, Guanwu
    Zhao, Yulong
    Xu, Wenju
    PROCEEDINGS OF 2013 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND COMPUTER APPLICATIONS (ICSA 2013), 2013, 92 : 81 - 86
  • [26] A Solution for Mining Big Data Based on Distributed Data Streams and Its Classifying Algorithms
    Mao, Guojun
    Qiao, Jiewei
    DATA MINING AND BIG DATA, DMBD 2017, 2017, 10387 : 263 - 271
  • [27] Distributed Fuzzy Rough Set for Big Data Analysis in Cloud Computing
    Qu, Wenhao
    Kong, Linghe
    Wu, Kaishun
    Tang, Feilong
    Chen, Guihai
    2019 IEEE 25TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2019 : 109 - 116
  • [28] Comprehensive Analysis of Various Big Data Classification Techniques: A Challenging Overview
    Abdalla, Hemn Barzan
    Abuhaija, Belal
    JOURNAL OF INFORMATION & KNOWLEDGE MANAGEMENT, 2023, 22 (01)
  • [29] Distributed Big Data Clustering using MapReduce-based Fuzzy C-Medoids
    Sardar, T. H.
    Ansari, Z.
    Journal of The Institution of Engineers (India): Series B, 2022, 103 (01) : 73 - 82
  • [30] Granular Aggregation of Fuzzy Rule-Based Models in Distributed Data Environment
    Zhang, Bowen
    Pedrycz, Witold
    Fayek, Aminah Robinson
    Gacek, Adam
    Dong, Yucheng
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2021, 29 (05) : 1297 - 1310