A Comparative Analysis of Biological Data Integration Systems Famous for Data Exploitation and Knowledge Discovery

Cited by: 1
Authors
Irshad, Omer [1]
Khan, Muhammad Usman Ghani [1]
Affiliations
[1] Univ Engn & Technol, Fac Elect Engn, Dept Comp Sci & Engn, POB 54890, Lahore, Pakistan
Keywords
Data analysis; data integration; multi-omics; data schema characteristics; data heterogeneity; data warehouse; syntax-based data schema; semantic data schema; BIOINFORMATICS WEB SERVICES; SEMANTIC-WEB; RETRIEVAL-SYSTEM; ONTOLOGY; ACCESS; GENENAMES.ORG; RESOURCES; FRAMEWORK; GENOMES; REGIONS
DOI
10.2174/1574893615999210101125442
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Integrating heterogeneous biological databases to uncover new intra-molecular and inter-molecular attributes, behaviors, and relationships in the human cellular system has long been a focus of research in computational biology. To this end, many biological data integration systems have been deployed over the last couple of decades. A prime objective common to all these systems is to help end users explore, exploit, and analyze the integrated biological data for knowledge extraction. With the advent of high-throughput data generation technologies in particular, biological data is growing continuously and exponentially while becoming more heterogeneous and geographically dispersed. As a result, biological data integration systems face current and future challenges in data integration and data organization. The objective of this review is to quantitatively evaluate and compare some of the recent warehouse-based multi-omics data integration systems and check their compliance with current and future data integration needs. To do so, we identified major data integration design characteristics that a multi-omics data integration model should have in order to comprehensively address these challenges. Based on these design characteristics and the evaluation criteria, we evaluated some of the recent data warehouse systems and present categorical and comparative analysis results. The results show that most of the systems exhibit no or only partial compliance with the required data integration design characteristics. These systems therefore need design improvements to adequately address current and future data integration challenges while keeping their service-level commitments in place.
Pages: 662-681
Page count: 20