Programming knowledge discovery workflows in service-oriented distributed systems

Cited by: 8
Authors
Cesario, Eugenio [1]
Lackovic, Marco [2]
Talia, Domenico [1, 2]
Trunfio, Paolo [2]
Affiliations
[1] ICAR CNR, Arcavacata Di Rende, CS, Italy
[2] Univ Calabria, DEIS, I-87036 Arcavacata Di Rende, CS, Italy
Source
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE | 2013 / Vol. 25 / No. 10
Keywords
distributed data mining; workflows; Grid computing; Knowledge Grid;
DOI
10.1002/cpe.2936
Chinese Library Classification number
TP31 [Computer software];
Subject classification code
081202 ; 0835 ;
Abstract
In several scientific and business domains, very large data repositories are generated. To find interesting and useful information in those repositories, efficient data mining techniques and knowledge discovery processes must be used. In science, data mining helps scientists form hypotheses and supports their scientific practice, whereas in industry it lets companies turn existing data sources into real value by extracting knowledge from them. Data mining tasks are often composed of multiple stages that may be linked to each other to form various execution flows. Moreover, data mining tasks are often distributed because they involve data and tools located across geographically distributed environments. Therefore, it is fundamental to exploit effective paradigms, such as services and workflows, to model data mining tasks that are both multi-staged and distributed. This paper discusses data mining services and workflows for analyzing scientific data in high-performance distributed environments such as Grids and Clouds. We discuss how basic and complex services can be defined to support distributed data mining tasks in Grids. We also present a workflow formalism and a service-oriented programming framework, named DIS3GNO, for designing and running distributed knowledge discovery processes in the Knowledge Grid system. DIS3GNO supports all the phases of a knowledge discovery process, including composition, execution, and results visualization. After introducing DIS3GNO, we discuss some relevant use cases implemented with it and a performance evaluation of the system. Copyright (C) 2012 John Wiley & Sons, Ltd.
Pages: 1482 - 1504
Number of pages: 23