An Empirical Study of (Multi-) Database Models in Open-Source Projects

Cited by: 4
Authors
Benats, Pol [1]
Gobert, Maxime [1]
Meurice, Loup [1]
Nagy, Csaba [2]
Cleve, Anthony [1]
Affiliations
[1] Univ Namur, Namur Digital Inst, Namur, Belgium
[2] Univ Svizzera Italiana, Software Inst, Lugano, Switzerland
Source
CONCEPTUAL MODELING, ER 2021 | 2021 / Vol. 13011
Keywords
Data models; Open-source projects; Empirical study; EVOLUTION; GITHUB;
DOI
10.1007/978-3-030-89022-3_8
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Managing data-intensive systems has long been recognized as an expensive and error-prone process. This is mainly due to the often implicit consistency relationships that hold between applications and their database. As new technologies emerged for specialized purposes (e.g., graph databases, document stores), the joint use of database models has also become popular. There are undeniable benefits of such multi-database models where developers combine various technologies. However, the side effects on design, querying, and maintenance are not well-known yet. In this paper, we study multi-database models in software systems by mining major open-source repositories. We consider four years of history, from 2017 to 2020, of a total number of 40,609 projects with databases. Our results confirm the emergence of hybrid data-intensive systems as we found (multi-) database models (e.g., relational and non-relational) used together in 16% of all database-dependent projects. One percent of the systems added, deleted, or changed a database during the four years. The majority (62%) of these systems had a single database before becoming hybrid, and another significant part (19%) became "mono-database" after initially using multiple databases. We examine the evolution of these systems to understand the rationale of the design choices of the developers. Our study aims to guide future research towards new challenges posed by those emerging data management architectures.
Pages: 87-101
Page count: 15
Related papers
30 in total
[1]   Supporting Analysis of SQL Queries in PHP AiR [J].
Anderson, David ;
Hills, Mark .
2017 IEEE 17TH INTERNATIONAL WORKING CONFERENCE ON SOURCE CODE ANALYSIS AND MANIPULATION (SCAM), 2017, :153-158
[2]  
Basciani F., 2020, MODELS 2020
[3]  
Benats P., REPL PKG
[4]  
Bernstein P.A., 2007, SIGMOD 07, P1
[5]  
Bird C, 2011, P 19 ACM SIGSOFT S 1, P4, DOI 10.1145/2025113.2025119
[6]   What's in a GitHub Star? Understanding Repository Starring Practices in a Social Coding Platform [J].
Borges, Hudson ;
Valente, Marco Tulio .
JOURNAL OF SYSTEMS AND SOFTWARE, 2018, 146 :112-129
[7]   Understanding database schema evolution: A case study [J].
Cleve, Anthony ;
Gobert, Maxime ;
Meurice, Loup ;
Maes, Jerome ;
Weber, Jens .
SCIENCE OF COMPUTER PROGRAMMING, 2015, 97 :113-121
[8]   A Survey on NoSQL Stores [J].
Davoudian, Ali ;
Chen, Liu ;
Liu, Mengchi .
ACM COMPUTING SURVEYS, 2018, 51 (02)
[9]   An empirical comparison of dependency network evolution in seven software packaging ecosystems [J].
Decan, Alexandre ;
Mens, Tom ;
Grosjean, Philippe .
EMPIRICAL SOFTWARE ENGINEERING, 2019, 24 (01) :381-416
[10]  
Decan Alexandre, 2015, POST P 8 SEM ADV TEC, V1820, P26