Peta-Scale Data Warehousing at Yahoo!

Authors
Ahuja, Mona [1 ]
Chen, Cheng Che [1 ]
Gottapu, Ravi [1 ]
Hallmann, Joerg [1 ]
Hasan, Waqar [1 ]
Johnson, Richard [1 ]
Kozyrczak, Maciek [1 ]
Pabbati, Ramesh [1 ]
Pandit, Neeta [1 ]
Pokuri, Sreenivasulu [1 ]
Uppala, Krishna [1 ]
Affiliation
[1] Yahoo Inc, Sunnyvale, CA 94089 USA
Source
ACM SIGMOD/PODS 2009 Conference | 2009
Keywords
Column Database; MPP Database; Vector Query Processing; Column Storage; Data Warehousing; Analytics; Business Intelligence;
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Insights based on detailed data on consumer behavior, product performance and marketplace behavior are driving innovation and competition in the internet space. We introduce Everest, a SQL-compliant data warehousing engine, based on a column architecture, that we have built and deployed at Yahoo!. In contrast to commercially available engines, this massively parallel engine, based on commodity hardware, offers scale, flexibility, specialized analytic operations, and lower administrative and hardware costs. In this paper, we describe the business motivation and the software and deployment architecture of Everest. The engine has been in production at Yahoo! since 2007 and currently manages over six petabytes of data.
Pages: 855-861
Page count: 7