PoliViews: A comprehensive and modular approach to the conceptual modeling of genomic data

Cited by: 0
Authors
Bernasconi, Anna [1]
Garcia, S. Alberto [2]
Ceri, Stefano [1]
Pastor, Oscar [2]
Affiliations
[1] Politecn Milan, Dept Elect Informat & Bioengn, Milan, Italy
[2] Univ Politecn Valencia, VRAIN Res Inst, PROS Res Ctr, Valencia, Spain
Keywords
Conceptual modeling; Data repositories; Data integration; Biological datasets; Genomics; Scientific databases; GENE-EXPRESSION; INTEGRATIVE ANALYSIS; DNA ELEMENTS; ENCYCLOPEDIA; ENVIRONMENT; ATLAS
DOI
10.1016/j.datak.2023.102201
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The complexity of the human genome is captured by many signals, representing, for instance, DNA variations, gene expression, or structural rearrangements of DNA; a rich set of data types and formats is used to record these signals. Conceptual models can support the description and explanation of the genome's elaborate structure and behavior. Among others, the Conceptual Schema of the Human Genome (CSG) provides a concept-oriented, top-down representation of genome behavior that is independent of data formats. The Genomic Conceptual Model (GCM) instead provides a data-oriented, bottom-up representation, targeting a well-organized, unified description of these formats. In this research, we join the two approaches to obtain PoliViews, a comprehensive model that links (1) a concepts layer, describing genome elements and their conceptual connections, with (2) a data layer, describing datasets derived from genome sequencing with specific technologies. The two layers are connected dynamically: choosing specific genomic data types in the data layer triggers the selection of a view in the concepts layer. The benefit is mutual: data records can be semantically described by high-level concepts through their links and, in turn, the continuously evolving abstract model can be extended thanks to the input provided by real datasets. PoliViews enables queries that take a holistic conceptual perspective on the genome and are translated directly into data-oriented terms and organization. Here, we demonstrate the approach by linking two major genomic data types, namely DNA variation and gene expression. For each type, we consider several prominent data sources and describe their mapping to the corresponding view in the concepts layer, enabling intra-data-type integration. Then, leveraging the connections available in the concepts layer, we show how the distinct data types can interoperate, enabling inter-data-type integration. The PoliViews approach is illustrated through several examples of biological interest and can be extended to any kind of genomic information.
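The two-layer idea described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the concept names, the `CONCEPT_VIEWS` mapping, the `Dataset` class, the placeholder source names, the toy records, and the choice to combine views by set union are all hypothetical and are not taken from the PoliViews schema or the paper.

```python
from dataclasses import dataclass, field

# Hypothetical concepts layer: one view (a set of concept names) per data type.
# These names are illustrative only, not the actual PoliViews concepts.
CONCEPT_VIEWS = {
    "variation": {"Gene", "Variant", "Chromosome"},
    "expression": {"Gene", "Transcript", "Tissue"},
}


@dataclass
class Dataset:
    """Hypothetical data-layer entry: a source, its data type, and its records."""
    source: str
    data_type: str  # key into CONCEPT_VIEWS
    records: list = field(default_factory=list)  # dicts keyed by concept name


def select_view(data_types):
    """Choosing data types selects a view; here, the union of their concept sets."""
    view = set()
    for data_type in data_types:
        view |= CONCEPT_VIEWS[data_type]
    return view


def shared_concepts(type_a, type_b):
    """Concepts present in both views; candidates for linking the two data types."""
    return CONCEPT_VIEWS[type_a] & CONCEPT_VIEWS[type_b]


def join_on_concept(left, right, concept):
    """Pair records from two datasets that agree on the value of a shared concept."""
    index = {}
    for rec in right.records:
        index.setdefault(rec[concept], []).append(rec)
    return [
        {**l_rec, **r_rec}
        for l_rec in left.records
        for r_rec in index.get(l_rec[concept], [])
    ]


# Toy records with made-up values, for illustration only.
variants = Dataset("source_A", "variation", [
    {"Gene": "BRCA1", "Variant": "v1"},
    {"Gene": "TP53", "Variant": "v2"},
])
expression = Dataset("source_B", "expression", [
    {"Gene": "BRCA1", "Tissue": "breast", "level": 7.5},
])

view = select_view(["variation", "expression"])
link = shared_concepts("variation", "expression")
joined = join_on_concept(variants, expression, "Gene")
```

In this sketch, the data types are linked only through concepts that both views share (here the hypothetical `Gene` concept), which loosely mirrors the idea of inter-data-type integration passing through the concepts layer rather than through format-specific fields.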
Pages: 17