A SEMANTIC SCRAPING MODEL FOR WEB RESOURCES: Applying Linked Data to Web Page Screen Scraping

Cited by: 0
Authors
Ignacio Fernandez-Villamor, Jose [1]
Blasco-Garcia, Jacobo [1]
Iglesias, Carlos A. [1]
Garijo, Mercedes [1]
Affiliations
[1] Univ Politecn Madrid, Dept Ingn Sistemas Telemat, Madrid, Spain
Keywords
Information extraction; Linked data; Screen scraping
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Despite the growing presence of Semantic Web facilities, only a limited share of the resources available on the Internet provide semantic access. Recent initiatives such as the emerging Linked Data Web provide semantic access to available data by porting existing resources to the Semantic Web using different technologies, such as database-semantic mapping and scraping. Nevertheless, existing scraping solutions are ad hoc, complemented with graphical interfaces that speed up scraper development. This article proposes a generic framework for web scraping based on semantic technologies. The framework is structured in three levels: scraping services, semantic scraping model, and syntactic scraping. The first level provides a high-level interface through which generic applications or intelligent agents gather information from the web. The second level defines a semantic RDF model of the scraping process, providing a declarative approach to the scraping task. Finally, the third level provides an implementation of the RDF scraping model for specific technologies. The work has been validated in a scenario that illustrates its application to mashup technologies.
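The three levels described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use their RDF vocabulary: the `MAPPING` dictionary, the `ClassTextExtractor` class, the `scrape` function, and the `http://example.org/` URIs are all hypothetical names chosen for illustration. The sketch assumes that a declarative mapping ties HTML class names to RDF properties (standing in for the semantic scraping model), that a small parser does the syntactic extraction, and that a single function plays the role of the scraping service by returning plain (subject, predicate, object) triples.

```python
from html.parser import HTMLParser

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical declarative mapping (stands in for the semantic scraping model):
# it states which HTML class corresponds to which RDF property.
MAPPING = {
    "rdf_type": "http://example.org/schema#Post",  # assumed class URI
    "fields": {
        "title": "http://purl.org/dc/terms/title",
        "author": "http://purl.org/dc/terms/creator",
    },
}


class ClassTextExtractor(HTMLParser):
    """Syntactic level: record the first text found inside each wanted class."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None  # class name of the element being read, if any
        self.found = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.wanted:
                self.current = name

    def handle_data(self, data):
        if self.current and data.strip():
            # Keep only the first non-empty text for each class.
            self.found.setdefault(self.current, data.strip())

    def handle_endtag(self, tag):
        self.current = None


def scrape(html, subject_uri, mapping=MAPPING):
    """Service level: return (subject, predicate, object) triples for one page."""
    parser = ClassTextExtractor(mapping["fields"])
    parser.feed(html)
    triples = [(subject_uri, RDF_TYPE, mapping["rdf_type"])]
    for css_class, predicate in mapping["fields"].items():
        if css_class in parser.found:
            triples.append((subject_uri, predicate, parser.found[css_class]))
    return triples


if __name__ == "__main__":
    page = '<div><h1 class="title">Hello</h1><span class="author">Ana</span></div>'
    for triple in scrape(page, "http://example.org/post/1"):
        print(triple)
```

Under these assumptions, supporting a new site means editing only the mapping, which is the kind of declarative separation the abstract attributes to the second level.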
Pages: 451-456
Page count: 6