NoSQL Schema Evolution and Big Data Migration at Scale

Cited by: 0
Authors
Klettke, Meike [1]
Stoerl, Uta [2]
Shenavai, Manuel [2]
Scherzinger, Stefanie [3]
Affiliations
[1] Univ Rostock, Rostock, Germany
[2] Univ Appl Sci, Darmstadt, Germany
[3] OTH Regensburg, Regensburg, Germany
Source
2016 IEEE International Conference on Big Data (Big Data) | 2016
Keywords
NoSQL Databases; Schema Evolution; Data Migration Strategies; Lazy Migration; Lazy Composite Migration; Incremental Migration; Predictive Migration;
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
This paper explores scalable implementation strategies for carrying out lazy schema evolution in NoSQL data stores. For decades, schema evolution has been an evergreen in database research. Yet new challenges arise in the context of cloud-hosted data backends: With all database reads and writes charged by the provider, migrating the entire data instance eagerly into a new schema can be prohibitively expensive. Thus, lazy migration may be more cost-efficient, as legacy entities are only migrated in case they are actually accessed by the application. Related work has shown that the overhead of migrating data lazily is affordable when a single evolutionary change is carried out, such as adding a new property. In this paper, we focus on long-term schema evolution, where chains of pending schema evolution operations may have to be applied. Chains occur when legacy entities written several application releases back are finally accessed by the application. We discuss strategies for dealing with chains of evolution operations, in particular, the composition into a single, equivalent composite migration that performs the required version jump. Our experiments with MongoDB focus on scalable implementation strategies. Our lineup further compares the number of write operations, and thus, the operational costs of different data migration strategies.
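The contrast the abstract draws between applying a chain of pending operations one at a time and performing a single composite version jump can be illustrated with a minimal sketch. It is not the authors' implementation: the in-memory `Store`, the `_v` version field, the helper names, and the three example operations are all hypothetical, and the "composition" here simply applies the pending operations in memory before one write rather than deriving a single equivalent operation.

```python
from typing import Any, Callable, Dict, List

Entity = Dict[str, Any]
Operation = Callable[[Entity], Entity]


def add_property(name: str, default: Any) -> Operation:
    """Evolution operation: add a property with a default value."""
    def op(entity: Entity) -> Entity:
        updated = dict(entity)
        updated.setdefault(name, default)
        return updated
    return op


def rename_property(old: str, new: str) -> Operation:
    """Evolution operation: rename a property if it is present."""
    def op(entity: Entity) -> Entity:
        updated = dict(entity)
        if old in updated:
            updated[new] = updated.pop(old)
        return updated
    return op


# Hypothetical evolution history: OPERATIONS[i] migrates an entity
# from schema version i + 1 to schema version i + 2.
OPERATIONS: List[Operation] = [
    add_property("email", None),
    rename_property("name", "full_name"),
    add_property("active", True),
]
LATEST_VERSION = len(OPERATIONS) + 1


class Store:
    """In-memory stand-in for a data store that counts (billable) writes."""

    def __init__(self, entities: Dict[str, Entity]) -> None:
        self.entities = {key: dict(value) for key, value in entities.items()}
        self.writes = 0

    def get(self, key: str) -> Entity:
        return dict(self.entities[key])

    def put(self, key: str, entity: Entity) -> None:
        self.entities[key] = dict(entity)
        self.writes += 1


def read_stepwise(store: Store, key: str) -> Entity:
    """Lazy migration on access: one write per pending operation."""
    entity = store.get(key)
    while entity["_v"] < LATEST_VERSION:
        entity = OPERATIONS[entity["_v"] - 1](entity)
        entity["_v"] += 1
        store.put(key, entity)
    return entity


def read_composite(store: Store, key: str) -> Entity:
    """Lazy composite migration on access: one write for the whole chain."""
    entity = store.get(key)
    if entity["_v"] < LATEST_VERSION:
        for op in OPERATIONS[entity["_v"] - 1:]:
            entity = op(entity)
        entity["_v"] = LATEST_VERSION
        store.put(key, entity)
    return entity
```

In this sketch, an entity left at version 1 costs three writes under `read_stepwise` and one write under `read_composite`, while both return the same migrated entity. Entities that are never read are never written, which is the cost argument for lazy over eager migration.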
Pages: 2764-2774
Page count: 11