The Case for Cross-Entity Delta Encoding in Web Compression

Cited by: 1
Authors
Wollmer, Benjamin [1, 3]
Wingerath, Wolfram [2, 3]
Ferrlein, Sophie [3]
Panse, Fabian [1]
Gessert, Felix [3]
Ritter, Norbert [1]
Affiliations
[1] Univ Hamburg, Hamburg, Germany
[2] Carl von Ossietzky Univ Oldenburg, Oldenburg, Germany
[3] Baqend, Hamburg, Germany
Source
WEB ENGINEERING (ICWE 2022) | 2022 / Vol. 13362
Keywords
Delta encoding; Caching; Dictionary compression
DOI
10.1007/978-3-031-09917-5_12
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Delta encoding and shared dictionary compression (SDC) for accelerating Web content have been studied extensively over the last two decades, but have so far found only limited adoption in industry: compression approaches that use a custom-tailored dictionary per website have all failed in practice because of missing browser support and high overall complexity. General-purpose SDC approaches such as Brotli reduce complexity by shipping the same dictionary for all use cases, while most delta encoding approaches only consider similarities between versions of the same entity (but not between different entities). In this study, we investigate how much of the potential benefit of SDC and delta encoding is left on the table by these two simplifications. As our first contribution, we describe the idea of cross-entity delta encoding, which uses cached assets from the immediate browser history for content encoding instead of a precompiled shared dictionary: this avoids the need to create a custom dictionary while still enabling highly customized and efficient compression. Second, we present an experimental evaluation of compression efficiency that compares cross-entity delta encoding against state-of-the-art Web compression algorithms. We deliberately include algorithms that are not yet available in browsers in order to understand their potential value before resources are invested in building them. Our results indicate that cross-entity delta encoding is over 50% more efficient for text-based resources than industry-standard compression. We hope our findings motivate further research and development on this topic.
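The core idea in the abstract, using an already cached asset as the dictionary for a different asset, can be illustrated with a minimal sketch. This is not the authors' implementation or evaluation setup: it swaps in DEFLATE with a preset dictionary from Python's standard `zlib` module as a stand-in for the algorithms studied in the paper, and the page template, the sample pages, and the helper names `make_page`, `deflate`, and `inflate` are all hypothetical.

```python
import zlib

# Hypothetical page template shared by two different entities (pages) of one site.
TEMPLATE = (
    "<!doctype html><html><head><title>{title}</title>"
    '<link rel="stylesheet" href="/static/shop.css"></head><body>'
    '<nav><a href="/">Home</a><a href="/catalog">Catalog</a>'
    '<a href="/cart">Cart</a><a href="/account">Account</a></nav>'
    '<main class="product"><h1>{title}</h1><p class="description">{body}</p>'
    '<button class="add-to-cart">Add to cart</button></main>'
    "<footer>Example Shop, all rights reserved</footer></body></html>"
)


def make_page(title: str, body: str) -> bytes:
    """Render one hypothetical page as bytes."""
    return TEMPLATE.format(title=title, body=body).encode("utf-8")


def deflate(data: bytes, dictionary: bytes = b"") -> bytes:
    """Compress data, optionally priming DEFLATE with a preset dictionary."""
    if dictionary:
        comp = zlib.compressobj(level=9, zdict=dictionary)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()


def inflate(payload: bytes, dictionary: bytes = b"") -> bytes:
    """Decompress a payload; the same dictionary must be supplied again."""
    if dictionary:
        decomp = zlib.decompressobj(zdict=dictionary)
    else:
        decomp = zlib.decompressobj()
    return decomp.decompress(payload) + decomp.flush()


# page_a stands for an asset that is already in the browser cache;
# page_b is a different entity requested next.
page_a = make_page("Blue Mug", "A ceramic mug holding 300 ml.")
page_b = make_page("Red Teapot", "A cast-iron teapot holding 1.2 litres.")

plain = deflate(page_b)                     # no dictionary
delta = deflate(page_b, dictionary=page_a)  # cached page_a acts as the dictionary

print(len(page_b), len(plain), len(delta))
```

Because both sides already hold `page_a`, only the compressed delta has to be transferred, and the receiver restores the page with `inflate(delta, dictionary=page_a)`. DEFLATE only looks back 32 KiB into a preset dictionary, so this sketch demonstrates the principle rather than the compression ratios reported in the paper.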
Pages: 177-185
Number of pages: 9