Software-Defined Infrastructure for Decentralized Data Lifecycle Governance: Principled Design and Open Challenges

Cited by: 61
Authors
Huang, Gang [1]
Luo, Chaoran [1]
Wu, Kaidong [1]
Ma, Yun [1]
Zhang, Ying [1]
Liu, Xuanzhe [1]
Affiliations
[1] Peking University, MoE, Key Laboratory of High Confidence Software Technologies, Beijing 100871, People's Republic of China
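Source
2019 39th IEEE International Conference on Distributed Computing Systems (ICDCS 2019) | 2019
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
data lifecycle governance; software defined; decentralized
DOI
10.1109/ICDCS.2019.00166
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract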
Exploring and mining the explosive growth of "big data" has already generated many innovative applications, especially recent advances in AI applications, and has thus produced great value for human society and civilization. However, because data governance activities, including creation, sharing, exchange, management, analytics, tracing, and accounting, follow centralized patterns, the potential value of big data distributed across the Internet is far from adequately explored. The recent introduction of data protection policies and laws such as GDPR makes the problem even more challenging. We are now at a moment of truth where the data governance infrastructure should be reconsidered and redesigned. In this paper, we propose a decentralized, software-defined infrastructure design: data owners can implement their own rules and deploy them to the application systems where the data are produced, for use in further governance activities. This approach is similar to popular software-defined networking, where users can deploy rules to switches and customize their use. Our principled infrastructure design can radically reform current data governance activities into a decentralized topology. On the one hand, data can be separated from the application that generates them, and data owners have full rights to decide where their data are stored and how they are shared. On the other hand, data users can search, discover, integrate, and analyze data from various data sources according to their application requirements and scenarios. As a result, we argue that our infrastructure can establish a new generation of responsive decentralized data governance that promotes innovation in linking data and better adapts to the open environment and diverse user requirements. From this perspective, we briefly discuss some key insights and enumerate several related new technologies and open challenges.
Pages: 1674-1683
Page count: 10