CrossPrefetch: Accelerating I/O Prefetching for Modern Storage

Cited by: 1

Authors:

Garg, Shaleen [1]

Zhang, Jian [1]

Pitchumani, Rekha [2]

Parashar, Manish [3]

Xie, Bing [4]

Kannan, Sudarsun [1]

Affiliations:

[1] Rutgers State Univ, Piscataway, NJ 08855 USA

[2] Samsung, Ridgefield Pk, NJ USA

[3] Univ Utah, Salt Lake City, UT USA

[4] Microsoft, Redmond, WA USA

Source:

PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS, ASPLOS 2024, VOL 1 | 2024

DOI:

10.1145/3617232.3624872

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

We introduce CrossPrefetch, a novel cross-layered I/O prefetching mechanism that operates across the OS and a user-level runtime to achieve optimal performance. Existing OS prefetching mechanisms suffer from rigid interfaces that do not provide information to applications on the prefetch effectiveness, suffer from high concurrency bottlenecks, and are inefficient in utilizing available system memory. CrossPrefetch addresses these limitations by dividing responsibilities between the OS and runtime, minimizing overhead, and achieving low cache misses, lock contentions, and higher I/O performance. CrossPrefetch tackles the limitations of rigid OS prefetching interfaces by maintaining and exporting cache state and prefetch effectiveness to user-level runtimes. It also addresses scalability and concurrency bottlenecks by distinguishing between regular I/O and prefetch operations paths and introduces fine-grained prefetch indexing for shared files. Finally, CrossPrefetch designs low-interference access pattern prediction combined with support for adaptive and aggressive techniques to exploit memory capacity and storage bandwidth. Our evaluation of CrossPrefetch, encompassing microbenchmarks, macrobenchmarks, and real-world workloads, illustrates performance gains of up to 1.22x-3.7x in I/O throughput. We also evaluate CrossPrefetch across different file systems and local and remote storage configurations.


Pages: 102-116

Page count: 15
