Storage Short Take #37

A Busy Storage Week

Dell makes one hell of a storage announcement, SNIA Swordfish is now an ISO standard, and major advances in NVMe uses.

As always, links were live at time of publication.


Storage Media and Technology

This is a technical article on using hardware acceleration to increase performance and scalability of NVMe controllers. The authors "propose an open-source ultralow-latency and high-throughput NVMe controller with a highly parallel, pipelined, and scalable architecture that accommodates one admin controller and multiple fully hardware-automated I/O controllers." These are interesting claims, but I don't currently have a subscription so I include this here for those who do and are curious. Feel free to let me know what you find, if you do:

Tom Coughlin captures the essence of dual-actuator HDDs and other HDD improvements, as presented at the Storage Developer Conference. It's just one of many newsworthy topics from the event.

This is a really good article on how you can be absolutely paranoid about backing up, and yet still lose your data. That reminds me, I've been getting errors recently...

Speaking of which, Backblaze's testing has indicated that SSDs may, in fact, be as unreliable as disk drives. While the testing isn't complete (it's going to be probably another year before the full data comes in), it does warrant some attention.

Ready to completely geek out? If so, check out this article on "Optimizing Garbage Collection Overhead of Host-Level Flash Translation Layer for Journaling Filesystems." Garbage collection is how the drive manages itself, but it creates performance problems. Research into how to improve the trade-off ratio is a very welcome endeavor.

Saw this on LinkedIn (getting harder and harder to find non-advertisements in my feed, but that's another story). My friend Stephen Bates (CTO of Eideticom) made a really astute observation about new and interesting use cases for NVMe [PDF]:

Enabling an asynchronous, fast and efficient path for issuing arbitrary NVMe commands from user-space to the kernel is critical for the next evolution in NVM Express.

People are discovering NVMe is an amazing protocol for getting all sorts of work done on PCIe and Fabrics attached devices. As such they want their applications to issue pretty arbitrary NVMe commands (SQEs) against these devices. The in-kernel driver does not have that much flexibility (and rightly so) so this work provides a great path for enabling things like Key-Value SSDs and #computationalstorage devices.
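
For anyone wondering what an "SQE" actually looks like, here is a minimal sketch of packing a 64-byte NVMe Read submission queue entry in Python. The field offsets follow the common command format in the NVMe base specification as I understand it; the helper name build_read_sqe and its parameters are hypothetical, invented purely for illustration. It only builds the bytes and does not submit anything to a device or the kernel.

```python
import struct

NVME_CMD_READ = 0x02  # NVM command set opcode for Read


def build_read_sqe(cid, nsid, slba, nblocks, prp1=0, prp2=0):
    """Pack a 64-byte NVMe Read SQE (little-endian). Hypothetical helper."""
    if nblocks < 1:
        raise ValueError("nblocks must be at least 1")
    # CDW0: opcode in bits 7:0, command identifier in bits 31:16
    cdw0 = NVME_CMD_READ | ((cid & 0xFFFF) << 16)
    cdw10 = slba & 0xFFFFFFFF          # starting LBA, low 32 bits
    cdw11 = (slba >> 32) & 0xFFFFFFFF  # starting LBA, high 32 bits
    cdw12 = (nblocks - 1) & 0xFFFF     # number of logical blocks, zero-based
    return struct.pack(
        "<IIIIQQQIIIIII",
        cdw0,        # bytes 0-3
        nsid,        # bytes 4-7: namespace ID
        0, 0,        # bytes 8-15: CDW2, CDW3 (unused here)
        0,           # bytes 16-23: metadata pointer
        prp1, prp2,  # bytes 24-39: data pointer (PRP entries)
        cdw10, cdw11, cdw12,
        0, 0, 0,     # bytes 52-63: CDW13-CDW15 (unused here)
    )


sqe = build_read_sqe(cid=7, nsid=1, slba=0x100000002, nblocks=8)
print(len(sqe), sqe.hex())
```

The point of the quote is that an application could hand a structure like this to the kernel for a command the in-kernel driver doesn't natively model (a key-value or computational storage opcode, for example). How the entry is actually submitted is outside the scope of this sketch.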

Kioxia has announced software-defined flash technology, specifically focused on providing flexibility for hyperscalers. The idea is that users of the drives can customize them for different types of workloads.

Anandtech has announced its choices for the best HDDs for September.

Storage Companies in the News

"We decided to focus on NVMe/TCP because it performs as well as NVMe/FC but scales to much higher speeds at significantly lower cost."
Ihab Tarazi, SVP & CTO, Dell Technologies
The first item is a bombshell. Dell hasn't just thrown a gauntlet, they've shot it from a cannon: Shifting from Fibre Channel to TCP for NVMe over Fabrics traffic would have been unheard of at any point in time until now. This... this is a very big deal.

Anyone who has followed me or this blog for a while has heard me state the Storage Golden Rule: "Give me back the correct bit when I ask for it." When you rely on someone else's bit bucket, though, it doesn't absolve you of responsibility.

I was a bit amused by this headline from Blocks & Files: "Lightbits gets NVMe/TCP certification." Apparently VMware has blessed Lightbits with certification of the technology on vSphere 7 Update 3. Why is this amusing? Because Lightbits wrote NVMe/TCP. You kind of have to wonder what might be the baseline criteria for failure at that point instead.

If you've got Intel SSDs and/or Optane drives, you may want to check out Intel's Memory and Storage Tool. With it, you can test and report memory and storage performance, and keep an eye on any issues that may come up. Plus, it's free.

Synology just quietly announced support for Fibre Channel in DSM 7.0.1. Looks like they're using Marvell's QLogic adapters.

Industry Associations and Standards

SNIA Swordfish™ receives ISO Standard accreditation.

Other industry news: the CXL Consortium and PCI-SIG announce a marketing MOU (Memorandum of Understanding). Why this matters: CXL is heavily based upon the physical PCIe protocols and electrical interfaces. Coordinating the messaging means that what people learn about the developments of both technologies should be consistent.

Webinars, Blogs, and Conferences

Kamal Bakshi of Cisco has outlined a crap-ton of NVMe-oF demonstrations. Some of them are marketing-oriented, but a lot have useful, neutral info. I'm reprinting them here because there is a lot of information that will be useful. I'm keeping it to the technology, though, instead of the showcase of Cisco tech, because I think it's good to know and understand some of the useful work that Kamal has done for NVMe itself:

One of the big questions that came up at the Storage Developer Conference (awesome content, by the way) had to do with eBPF - the extended Berkeley Packet Filter. It's a technology that has applicability for SmartNICs, the Linux Kernel, and Storage. The co-chairs and editor of the Computational Storage Technical Working Group in SNIA wrote a blog about how eBPF can be used in Computational Storage environments.

Bonus Round

One of the reasons why speeds and feeds bore me to death is that they don't matter. Seriously. This time next year, the capabilities that vendors crow about will be considered passé. I also think that some people tend to use S&F conversations to mask flaws in their architecture or implementation. That's why this post by Greg Schulz from 2009 is so valuable - it's just as important today as it was when it was written (regardless of the S&F crap).

By the way, Greg is a great guy and a prolific author. If you haven't seen his library of storage texts, you really should do that soon.