DeltaFS: A Scalable No-Ground-Truth Filesystem For Massively-Parallel Computing

Cited by: 3
Authors
Zheng, Qing [1]
Cranor, Charles D. [1]
Ganger, Gregory R. [1]
Gibson, Garth A. [1]
Amvrosiadis, George [1]
Settlemyer, Bradley W. [2]
Grider, Gary A. [2]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Los Alamos Natl Lab, Los Alamos, NM USA
Source
SC21: INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS | 2021
Keywords
FILE SYSTEM; PERFORMANCE;
DOI
10.1145/3458817.3476148
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
High-Performance Computing (HPC) is known for its use of massive concurrency. But it can be challenging for a parallel filesystem's control plane to utilize cores when every client process must globally synchronize and serialize its metadata mutations with those of other clients. We present DeltaFS, a new paradigm for distributed filesystem metadata. DeltaFS allows jobs to self-commit their namespace changes to logs, avoiding the cost of global synchronization. Followup jobs selectively merge logs produced by previous jobs as needed, a principle we term No Ground Truth, which allows for efficient data sharing. By avoiding unnecessary synchronization of metadata operations, DeltaFS improves metadata operation throughput up to 93x, leveraging parallelism on the nodes where job processes run. This speedup grows as job size increases. DeltaFS enables efficient inter-job communication, reducing overall workflow runtime by significantly improving client metadata operation latency up to 49x and resource usage up to 52x.
Pages: 15