A survey of distributed classification based ensemble data mining methods

Cited by: 4
Authors
Mokeddem, D. [1]
Belbachir, H. [1]
Affiliation
[1] Laboratory of Signal, Systems and Databases LSSD, Department of Computer Sciences, University of Sciences and Technology Mohamed Boudiaf, El Mnaouer, Oran
Keywords
Decision trees algorithm; Distributed data mining; Ensemble learning methods
DOI
10.3923/jas.2009.3739.3745
Abstract
Distributed classification is a distributed data mining task that predicts whether a data instance belongs to a predefined class. It serves two different objectives: the first is to scale up algorithms to large data sets by distributing the data in order to increase overall efficiency; the second is to process data that are inherently distributed and autonomous. Ensemble learning methods, which are very promising in terms of accuracy and also have a distributed aspect, can be adapted to distributed data mining. This study surveys various approaches that use ensemble learning methods for distributed classification, with decision tree algorithms as the base classifier. In line with the two objectives mentioned above, most work reported in the literature addresses the problem using one of two techniques: the adaptation of ensemble learning methods to disjoint data sets, in the context of mining inherently distributed data, and the parallelization of ensemble learning methods, in a scalability context. From this survey, one can deduce that the work done from either perspective (scaling up data mining algorithms or mining inherently distributed data) could be complementary. Some open questions, current debates and future directions are also discussed. © 2009 Asian Network for Scientific Information.
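Illustrative sketch (not from the paper): the first technique named in the abstract, adapting an ensemble method to disjoint data sets, might look roughly like the following. It assumes each site trains a one-level decision tree (a decision stump, used here as a stand-in for a full decision tree algorithm) on its own partition, and that local models are combined by simple majority voting. The function names `train_stump`, `train_ensemble` and `predict`, the three "sites" and their toy data are all hypothetical.

```python
from collections import Counter


def train_stump(rows):
    """Fit a one-level decision tree on rows of (features, label)."""
    best = None  # (errors, feature index, threshold, left label, right label)
    n_features = len(rows[0][0])
    for f in range(n_features):
        # Candidate thresholds are the observed values of feature f.
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    _, f, threshold, left_label, right_label = best
    return lambda x: left_label if x[f] <= threshold else right_label


def train_ensemble(partitions):
    """Train one local model per disjoint partition; raw data never leaves its site."""
    return [train_stump(rows) for rows in partitions]


def predict(ensemble, x):
    """Combine the local models' predictions by majority vote."""
    votes = Counter(model(x) for model in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three sites, two features, binary labels.
site_a = [((1.0, 3.0), 0), ((2.0, 1.0), 0), ((7.0, 2.0), 1), ((8.0, 3.0), 1)]
site_b = [((3.0, 9.0), 0), ((4.0, 2.0), 0), ((6.0, 8.0), 1), ((9.0, 1.0), 1)]
site_c = [((0.5, 5.0), 0), ((2.5, 4.0), 0), ((6.5, 6.0), 1), ((9.5, 7.0), 1)]

ensemble = train_ensemble([site_a, site_b, site_c])
print(predict(ensemble, (8.0, 2.0)))
print(predict(ensemble, (1.5, 4.0)))
```

Only the trained models, not the underlying records, need to be exchanged between sites in this sketch, which is one reason ensemble methods are attractive for inherently distributed data.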
Pages: 3739-3745
Number of pages: 6
Related papers
50 in total
  • [21] Deploying mobile agents in distributed data mining
    Li, Xining
    Ni, Jinglo
    EMERGING TECHNOLOGIES IN KNOWLEDGE DISCOVERY AND DATA MINING, 2007, 4819 : 322 - +
  • [22] Distributed data mining for e-business
    Bin Liu
    Shu Gui Cao
    Wu He
    Information Technology and Management, 2011, 12 : 67 - 79
  • [23] Distributed data mining in grid computing environment
    Xue, Huifang
    AGRO FOOD INDUSTRY HI-TECH, 2017, 28 (01): : 2719 - 2723
  • [24] Distributed Execution Environment for Data Mining as Service
    Kholod, Ivan
    Borisenko, Konstantin
    PROCEEDINGS OF THE 2016 IEEE NORTH WEST RUSSIA SECTION YOUNG RESEARCHERS IN ELECTRICAL AND ELECTRONIC ENGINEERING CONFERENCE (ELCONRUSNW), 2016, : 236 - 241
  • [25] DXCS: an XCS system for distributed data mining
    Dam, Hai H.
    Abbass, Hussein A.
    Lokan, Chris
    GECCO 2005: GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, VOLS 1 AND 2, 2005, : 1883 - 1890
  • [26] Distributed data mining services leveraging WSRF
    Congiusta, Antonio
    Talia, Domenico
    Trunfio, Paolo
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF GRID COMPUTING THEORY METHODS AND APPLICATIONS, 2007, 23 (01): : 34 - 41
  • [27] Distributed data mining for e-business
    Liu, Bin
    Cao, Shu Gui
    He, Wu
    INFORMATION TECHNOLOGY & MANAGEMENT, 2011, 12 (02) : 67 - 79
  • [28] A communication efficient and scalable distributed data mining for the astronomical data
    Govada, A.
    Sahay, S. K.
    ASTRONOMY AND COMPUTING, 2016, 16 : 166 - 173
  • [29] Using distributed data mining and distributed artificial intelligence for knowledge integration
    de Paula, Ana C. M. P.
    Avila, Braulio C.
    Scalabrin, Edson
    Enembreck, Fabricio
    COOPERATIVE INFORMATION AGENTS XI, PROCEEDINGS, 2007, 4676 : 89 - +
  • [30] Improving diagnostic accuracy using agent-based distributed data mining system
    Sridhar, S.
    INFORMATICS FOR HEALTH & SOCIAL CARE, 2013, 38 (03) : 182 - 195