A Comprehensive Approach for the Conceptual Modeling of Genomic Data

Cited by: 10
Authors
Bernasconi, Anna [1, 2]
Garcia S, Alberto [1]
Ceri, Stefano [2]
Pastor, Oscar [1]
Affiliations
[1] Univ Politecn Valencia, Pros Res Ctr, VRAIN Res Inst, Valencia, Spain
[2] Politecn Milan, Dept Elect Informat & Bioengn, Milan, Italy
Source
CONCEPTUAL MODELING (ER 2022) | 2022 / Vol. 13607
Keywords
Conceptual modeling; Biological datasets; Genomics
DOI
10.1007/978-3-031-17995-2_14
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The human genome is traditionally represented as a DNA sequence of three billion base pairs. However, its intricacies are captured by many more complex signals, representing DNA variations, the expression of gene activity, or DNA's structural rearrangements; a rich set of data formats is used to represent such signals. Different conceptual models explain such elaborate structure and behavior. Among them, the Conceptual Schema of the Human Genome (CSG) provides a concept-oriented, top-down representation of the genome behavior - independent of data formats. The Genomic Conceptual Model (GCM) instead provides a data-oriented, bottom-up representation, targeting a well-organized, unified description of these formats. We hereby propose to join these two approaches to achieve a more complete vision, linking (1) a concepts layer, describing genome elements and their conceptual connections, with (2) a data layer, describing datasets derived from genome sequencing with specific technologies. The link is established when specific genomic data types are chosen in the data layer, thereby triggering the selection of a view in the concepts layer. The benefit is mutual, as data records can be semantically described by high-level concepts and exploit their links. In turn, the continuously evolving abstract model can be extended thanks to the input provided by real datasets. As a result, it will be possible to express queries that employ a holistic conceptual perspective on the genome, directly translated onto data-oriented terms and organization. The approach is here exemplified using the DNA variation data type but is applicable to all genomic information.
Pages: 194-208
Page count: 15