An overview of recent distributed algorithms for learning fuzzy models in Big Data classification

Cited by: 15
Authors
Ducange, Pietro [1 ]
Fazzolari, Michela [2 ]
Marcelloni, Francesco [1 ]
Affiliations
[1] Dipartimento Ingn Informaz, Largo Lucio Lazzarino 1, I-56122 Pisa, Italy
[2] CNR, Ist Informat & Telemat, Via Giuseppe Moruzzi 1, I-56124 Pisa, Italy
Keywords
Big Data; Fuzzy models; Data mining; Classification algorithms; Distributed computing; MULTIOBJECTIVE EVOLUTIONARY APPROACH; ASSOCIATIVE CLASSIFICATION; CLUSTERING-ALGORITHM; SYSTEMS; MAPREDUCE; ANALYTICS; DESIGN; GRANULARITY; CLASSIFIERS; SELECTION;
DOI
10.1186/s40537-020-00298-6
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Nowadays, huge amounts of data are generated, often in very short time intervals and in various formats, by many heterogeneous sources such as social networks and media, mobile devices, internet transactions, networked devices and sensors. These data, identified as Big Data in the literature, are characterized by the popular Vs features, such as Value, Veracity, Variety, Velocity and Volume. In particular, Value focuses on the useful knowledge that may be mined from data. Thus, in recent years, a number of data mining and machine learning algorithms have been proposed to extract knowledge from Big Data. These algorithms have generally been implemented by using ad-hoc programming paradigms, such as MapReduce, on specific distributed computing frameworks, such as Apache Hadoop and Apache Spark. In the context of Big Data, fuzzy models are currently playing a significant role, thanks to their capability of handling vague and imprecise data and their inherent interpretability. In this work, we give an overview of the most recent distributed learning algorithms for generating fuzzy classification models for Big Data. In particular, we first show some design and implementation details of these learning algorithms. Thereafter, we compare them in terms of accuracy and interpretability. Finally, we discuss their scalability.
Pages: 29
Related papers
50 in total
  • [1] An overview of recent distributed algorithms for learning fuzzy models in Big Data classification
    Pietro Ducange
    Michela Fazzolari
    Francesco Marcelloni
    Journal of Big Data, 7
  • [2] Fuzzy Models for Big Data Mining
    Ducange, Pietro
    FUZZY LOGIC AND APPLICATIONS, WILF 2018, 2019, 11291 : 257 - 260
  • [3] Reliable Distributed Fuzzy Discretizer for Associative Classification of Big Data
    Pushparani, Hepzi Jeya
    Goldena, Nancy Jasmine
    INTERNATIONAL JOURNAL OF INFORMATION RETRIEVAL RESEARCH, 2022, 12 (01)
  • [4] Models and algorithms for classifying big data based on distributed data streams
    Mao G.-J.
    Hu D.-J.
    Xie S.-Y.
    1600, Science Press (40): : 161 - 175
  • [5] CFM-BD: A Distributed Rule Induction Algorithm for Building Compact Fuzzy Models in Big Data Classification Problems
    Elkano, Mikel
    Antonio Sanz, Jose
    Barrenechea, Edurne
    Bustince, Humberto
    Galar, Mikel
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2020, 28 (01) : 163 - 177
  • [6] A Distributed Fuzzy Associative Classifier for Big Data
    Segatori, Armando
    Bechini, Alessio
    Ducange, Pietro
    Marcelloni, Francesco
    IEEE TRANSACTIONS ON CYBERNETICS, 2018, 48 (09) : 2656 - 2669
  • [7] A STUDY ON THE ERROR OF DISTRIBUTED ALGORITHMS FOR BIG DATA CLASSIFICATION WITH SVM
    Wang, Cheng
    Cao, Feilong
    ANZIAM JOURNAL, 2017, 58 (3-4) : 231 - 237
  • [8] Dynamic Distributed and Parallel Machine Learning algorithms for big data mining processing
    Djafri, Laouni
    DATA TECHNOLOGIES AND APPLICATIONS, 2022, 56 (04) : 558 - 601
  • [9] Big Data Image Classification Based on Distributed Deep Representation Learning Model
    Zhu, Minjun
    Chen, Qinghua
    IEEE ACCESS, 2020, 8 : 133890 - 133904
  • [10] On Distributed Fuzzy Decision Trees for Big Data
    Segatori, Armando
    Marcelloni, Francesco
    Pedrycz, Witold
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2018, 26 (01) : 174 - 192