Scalable OWL-Horst Ontology Reasoning using SPARK

Cited by: 0
Authors
Kim, Je-Min [1]
Park, Young-Tack [1]
Affiliations
[1] Soongsil Univ, Sch Comp, SSU, Seoul, South Korea
Source
2015 INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP) | 2015
Keywords
distributed computing; ontology reasoning; Hadoop; OWL Horst; SPARK;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
In this paper, we present an approach to reasoning over large-scale OWL ontologies in a Hadoop-based distributed computing cluster. Rule-based reasoning is typically used for scalable OWL-Horst reasoning: the system repeatedly applies rules derived from semantic axioms to a large set of ontology triples until no further data can be inferred. Because of this repetition, reasoning systems suffer from performance limitations when ontology reasoning is performed via disk-based MapReduce approaches. To overcome this drawback, we propose an approach that loads triples into the memory of computer nodes connected by SPARK, a memory-based cluster computing platform, and executes ontology reasoning there. To implement an OWL-Horst ontology reasoning system, we first define a set of algorithms that divide large triple sets into Resilient Distributed Datasets (RDDs), taking into account the patterns and interdependencies of the reasoning rules. We then load each RDD into the memory of the computers composing the distributed computing cluster and perform distributed reasoning according to the rule execution order. To evaluate the proposed approach, we compare it to WebPIE using the LUBM set, which is a formal dataset for evaluating ontology inference and search speeds. On LUBM6000 (860 million triples, 109 gigabytes), the proposed approach improves throughput by 200% (98k/sec) compared to WebPIE (33k/sec).
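The repeat-until-no-new-inference loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: plain Python sets stand in for Spark RDDs, only two RDFS-style rules (subclass transitivity and type inheritance, both part of what OWL-Horst builds on) are applied, and the names `apply_rules`, `closure`, and the `ex:` example triples are hypothetical.

```python
# Illustrative sketch only: in-memory Python sets stand in for Spark RDDs,
# and only two RDFS-style rules are shown (assumption, not the paper's rule set).
from typing import Set, Tuple

Triple = Tuple[str, str, str]

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def apply_rules(triples: Set[Triple]) -> Set[Triple]:
    """Apply both rules once and return every triple they derive."""
    # Split the triples by predicate pattern, loosely mirroring the idea of
    # grouping triples according to the rules that consume them.
    subclass = [(s, o) for s, p, o in triples if p == SUBCLASS]
    types = [(s, o) for s, p, o in triples if p == TYPE]
    derived: Set[Triple] = set()
    # Rule A: (a subClassOf b), (b subClassOf c) => (a subClassOf c)
    for a, b in subclass:
        for b2, c in subclass:
            if b == b2:
                derived.add((a, SUBCLASS, c))
    # Rule B: (x type a), (a subClassOf b) => (x type b)
    for x, a in types:
        for a2, b in subclass:
            if a == a2:
                derived.add((x, TYPE, b))
    return derived


def closure(triples: Set[Triple]) -> Set[Triple]:
    """Repeat rule application until no new triple is inferred (fixpoint)."""
    known = set(triples)
    while True:
        new = apply_rules(known) - known
        if not new:
            return known
        known |= new


facts = {
    ("ex:GraduateStudent", SUBCLASS, "ex:Student"),
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:alice", TYPE, "ex:GraduateStudent"),
}
inferred = closure(facts) - facts
```

Each pass re-reads the full triple set, which is why the abstract argues that keeping triples in memory matters: a disk-based pipeline would pay I/O cost on every iteration of the loop.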
Pages: 79-86
Page count: 8