Blind Men and an Elephant: Coalescing Open-source, Academic, and Industrial Perspectives on BigData

Cited by: 0
Authors
Douglas, Chris [1]
Curino, Carlo [1]
Affiliation
[1] Microsoft, Cloud & Informat Serv Lab, Redmond, WA 98052 USA
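Source
2015 IEEE 31st International Conference on Data Engineering (ICDE) | 2015
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract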
This tutorial is organized in two parts. In the first half, we will present an overview of applications and services in the BigData ecosystem. We will use known distributed database and systems literature as landmarks to orient the attendees in this fast-evolving space. Throughout, we will contrast models of resource management, performance, and the constraints that shape the architectures of prominent systems. We will also discuss the role of academia and industry in the development of open-source infrastructure, with an emphasis on open problems and strategies for collaboration. We assume only basic familiarity with distributed systems. In the second half, we will delve into Apache Hadoop YARN. YARN (Yet Another Resource Negotiator) transformed Hadoop from a MapReduce engine to a general-purpose cluster scheduler. Since its introduction, it has been deployed in production and extended to support use cases beyond large-scale batch processing. The tutorial will present the active research and development supporting such heterogeneous workloads, with particular attention to multi-tenant scheduling. Topics include security, resource isolation, protocols, and preemption. This portion will be detailed, but accessible to anyone with a background in distributed systems and all attendees of the first half of the tutorial.
Pages: 1523-1526
Page count: 4