Thursday, February 13

ZFS on Linux should get a persistent SSD read cache feature soon

Intel's Optane persistent memory is widely considered the best choice for ZFS write buffer devices. But L2ARC is more forgiving than SLOG, and larger, slower devices like standard consumer M.2 SSDs should work well for it, too. (credit: Jacek Halicki / CC BY-SA 4.0)

Today, a request for code review came across the ZFS developers' mailing list. Developer George Amanakis has ported and revised a code improvement that makes the L2ARC—OpenZFS's read cache device feature—persistent across reboots. Amanakis explains:

The last couple of months I have been working on getting L2ARC persistence to work in ZFSonLinux.

This effort was based on previous work by Saso Kiselkov (@skiselkov) in Illumos (https://www.illumos.org/issues/3525), which was later ported by Yuxuan Shui (@yshui) to ZoL (https://ift.tt/38qrvLe), subsequently modified by Jorgen Lundman (@lundman), and rebased to master with multiple additions and changes by me (@gamanakis).

The end result is in: https://github.com/zfsonlinux/zfs/pull/9582

For those unfamiliar with the nuts and bolts of ZFS, one of its distinguishing features is its use of the ARC—Adaptive Replacement Cache—algorithm for its read cache. Standard filesystem LRU (Least Recently Used) caches—used in NTFS, ext4, XFS, HFS+, APFS, and pretty much anything else you've likely heard of—will readily evict "hot" (frequently accessed) storage blocks if large volumes of data are read once.
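
To make that failure mode concrete, here is a minimal, purely illustrative Python sketch (the function name and block labels are made up for this example; this is not any real filesystem's cache code) showing how a plain LRU cache lets one large sequential read push out a block that is otherwise read constantly:

```python
from collections import OrderedDict

def lru_scan_demo(capacity=4):
    """Toy illustration: a plain LRU cache readily evicts a 'hot' block
    when a large one-time read streams through."""
    cache = OrderedDict()

    def read(block):
        if block in cache:
            cache.move_to_end(block)       # refresh recency on a hit
        else:
            cache[block] = f"data-{block}"  # miss: "read from disk"
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict least recently used

    for _ in range(10):
        read("hot")                         # a block that is read constantly
    for block in range(100):
        read(f"scan-{block}")               # one pass over cold data...
    print("hot still cached?", "hot" in cache)  # prints: hot still cached? False

lru_scan_demo()
```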

By contrast, each time a block is re-read within the ARC, it becomes more heavily prioritized and more difficult to push out of cache as new data is read in. The ARC also tracks recently evicted blocks—so if a block keeps getting read back into cache after eviction, this too will make it more difficult to evict. This leads to much higher cache hit rates—and therefore lower latencies, and more throughput and IOPS available from the actual disks—for most real-world workloads.
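
For illustration only, here is a toy, heavily simplified sketch in the same spirit; it is not the real ARC algorithm (which adaptively balances its lists) and not OpenZFS code, and the class name `ToyArcLikeCache` is invented for this example. Blocks that get re-read are promoted to a "frequent" list, and evicted blocks leave "ghost" entries, so a block that keeps coming back is harder to push out than in a plain LRU:

```python
from collections import OrderedDict

class ToyArcLikeCache:
    """Toy, heavily simplified sketch of ARC's core idea. Blocks read once
    sit in a 'recent' list, re-read blocks are promoted to a 'frequent' list,
    and evicted keys leave 'ghost' entries so a block that keeps coming back
    is re-admitted as hot."""

    def __init__(self, capacity, ghost_capacity=None):
        self.capacity = capacity
        self.ghost_capacity = ghost_capacity or capacity
        self.recent = OrderedDict()    # blocks read exactly once (recency side)
        self.frequent = OrderedDict()  # blocks read more than once (frequency side)
        self.ghosts = OrderedDict()    # keys of recently evicted blocks

    def read(self, key, fetch):
        if key in self.recent:                 # second read: promote to frequent
            self.frequent[key] = self.recent.pop(key)
            return self.frequent[key]
        if key in self.frequent:               # repeat read: refresh its position
            self.frequent.move_to_end(key)
            return self.frequent[key]

        value = fetch(key)                     # miss: "read from disk"
        if key in self.ghosts:                 # it was evicted recently and came
            self.ghosts.pop(key)               # back, so treat it as hot
            self.frequent[key] = value
        else:
            self.recent[key] = value
        self._evict()
        return value

    def _evict(self):
        while len(self.recent) + len(self.frequent) > self.capacity:
            # Drain the recency side first, so a large one-time scan cannot
            # flush frequently re-read blocks the way it would in a plain LRU.
            victim = self.recent if self.recent else self.frequent
            key, _ = victim.popitem(last=False)
            self.ghosts[key] = None
            while len(self.ghosts) > self.ghost_capacity:
                self.ghosts.popitem(last=False)
```

Replaying the same access pattern as the LRU demo above against `ToyArcLikeCache(4)` leaves the "hot" block cached after the 100-block scan, because eviction drains the recency side first while the frequency side holds on to it.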

