Semantic Annotation of Data Processing Pipelines in Scientific Publications

Cited by: 11
Authors
Mesbah, Sepideh [1]
Fragkeskos, Kyriakos [1]
Lofi, Christoph [1]
Bozzon, Alessandro [1]
Houben, Geert-Jan [1]
Affiliations
[1] Delft Univ Technol, Mekelweg 4, NL-2628 CD Delft, Netherlands
Source
SEMANTIC WEB (ESWC 2017), PT I | 2017 / Vol. 10249
DOI
10.1007/978-3-319-58068-5_20
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data processing pipelines are a core object of interest for data scientists and practitioners operating in a variety of data-related application domains. To effectively capitalise on the experience gained in the creation and adoption of such pipelines, the need arises for mechanisms able to capture knowledge about datasets of interest, data processing methods designed to achieve a given goal, and the performance achieved when applying such methods to the considered datasets. However, due to its distributed and often unstructured nature, this knowledge is not easily accessible. In this paper, we use (scientific) publications as a source of knowledge about Data Processing Pipelines. We describe a method designed to classify sentences according to the nature of the contained information (i.e. scientific objective, dataset, method, software, result), and to extract relevant named entities. The extracted information is then semantically annotated and published as linked data in open knowledge repositories according to the DMS ontology for data processing metadata. To demonstrate the effectiveness and performance of our approach, we present the results of a quantitative and qualitative analysis performed on four different conference series.
Pages: 321-336
Number of pages: 16
Related papers
50 in total
  • [21] SEMANTIC ANNOTATION OF AQUACULTURE PRODUCTION DATA
    Amaral, Pedro
    Oliveira, Pedro
    Moutinho, Marcio
    Matado, Daniel
    Costa, Ruben
    Sarraipa, Joao
    PROCEEDINGS OF THE ASME INTERNATIONAL MECHANICAL ENGINEERING CONGRESS AND EXPOSITION, 2016, VOL. 2, 2016,
  • [22] Concept Extraction Based on Semantic Models Using Big Amount of Patents and Scientific Publications Data
    Kaliteevskii, Vasilii
    Deder, Arthur
    Peric, Nemanja
    Chechurin, Leonid
    CREATIVE SOLUTIONS FOR A SUSTAINABLE DEVELOPMENT (TFC 2021), 2021, 635 : 141 - 149
  • [23] WORD-PROCESSING SYSTEMS AND SCIENTIFIC PUBLICATIONS
    DESBARRES, J
    ANALUSIS, 1984, 12 (06) : 285 - 289
  • [24] The need for scientific data annotation.
    Weintraub, HJR
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2003, 226 : U303 - U304
  • [25] Semantic Annotation of Data in Schemas to Support Data Translations
    Moutinho, Filipe
    Paiva, Luis
    Malo, Pedro
    Gomes, Luis
    PROCEEDINGS OF THE IECON 2016 - 42ND ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY, 2016, : 5283 - 5288
  • [26] Reporting microhardness data in scientific publications
    Pober, R
    JOURNAL OF DENTAL RESEARCH, 1998, 77 (10) : 1766 - 1766
  • [27] The use of semantic similarity measures for optimally integrating heterogeneous Gene Ontology data from large scale annotation pipelines
    Mazandu, Gaston K.
    Mulder, Nicola J.
    FRONTIERS IN GENETICS, 2014, 5
  • [28] Data mart construction based on semantic annotation of scientific articles: A case study for the prioritization of drug targets
    Coelho Teixeira, Marlon Amaro
    Belloze, Kele Teixeira
    Cavalcanti, Maria Claudia
    Silva-Junior, Floriano P.
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2018, 157 : 225 - 235
  • [29] Mining web data for image semantic annotation
    Basili, Roberto
    Petitti, Riccardo
    Saracino, Dario
    AI*IA 2007: ARTIFICIAL INTELLIGENCE AND HUMAN-ORIENTED COMPUTING, 2007, 4733 : 674 - +
  • [30] Semantic trajectories: Mobility data computation and annotation
    Yan, Z. (zhixian.yan@epfl.ch), 1600, Association for Computing Machinery, 2 Penn Plaza, Suite 701, New York, NY 10121-0701, United States (43):