An overview of recent distributed algorithms for learning fuzzy models in Big Data classification

Cited by: 15
Authors
Ducange, Pietro [1]
Fazzolari, Michela [2]
Marcelloni, Francesco [1]
Affiliations
[1] Dipartimento Ingn Informaz, Largo Lucio Lazzarino 1, I-56122 Pisa, Italy
[2] CNR, Ist Informat & Telemat, Via Giuseppe Moruzzi 1, I-56124 Pisa, Italy
Keywords
Big Data; Fuzzy models; Data mining; Classification algorithms; Distributed computing; Multiobjective evolutionary approach; Associative classification; Clustering algorithm; Systems; MapReduce; Analytics; Design; Granularity; Classifiers; Selection
DOI
10.1186/s40537-020-00298-6
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Nowadays, huge amounts of data are generated, often in very short time intervals and in various formats, by heterogeneous sources such as social networks and media, mobile devices, internet transactions, and networked devices and sensors. These data, identified as Big Data in the literature, are characterized by the well-known "Vs": Value, Veracity, Variety, Velocity and Volume. In particular, Value refers to the useful knowledge that can be mined from data. Thus, in recent years, a number of data mining and machine learning algorithms have been proposed to extract knowledge from Big Data. These algorithms have generally been implemented using ad-hoc programming paradigms, such as MapReduce, on specific distributed computing frameworks, such as Apache Hadoop and Apache Spark. In the context of Big Data, fuzzy models currently play a significant role, thanks to their ability to handle vague and imprecise data and to their inherent interpretability. In this work, we give an overview of the most recent distributed learning algorithms for generating fuzzy classification models for Big Data. We first present some design and implementation details of these learning algorithms. We then compare them in terms of accuracy and interpretability. Finally, we discuss their scalability.
Pages: 29