There is quite a bit of hype right now about NVMe and its sister standard, NVMe over Fabrics (NVMe-oF). Along with the hype comes a bit of confusion, though, so I have found myself talking about a number of the different Fabrics (as NVMe-oF is often shortened), where they fit, where they might not fit, and even how they work.
I’ve decided, whenever I can scrape together the time, to discuss some of the technologies behind NVMe over Fabrics, as well as some alternatives emerging on the market that are not standards-based. This does not mean they aren’t good, or that they don’t work: it just means that they solve the problem the NVM Express group has laid out in a different way.
The first technology I’m going to talk about is Fibre Channel (FC). Quite frankly, very little has been said about NVMe and FC when it comes to Fabrics, and I think it might be a good idea to talk a bit about some of the reasons why FC may make a good choice for some users of NVMe-oF.
Background of NVMe-oF
If you’re not familiar with NVMe, I recommend that you start with a decent program of study to learn about what it is. I’m making the assumption that you have already had some exposure to NVMe, if not NVMe-oF at this point (but in case you haven’t, I’ve put together a bibliography for you).
NVMe over Fabrics is a way of extending access to non-volatile memory (NVM) devices, using the NVM Express (NVMe) protocol, to remote storage subsystems. That is, if you want to connect to these storage devices over a network, NVMe-oF is the standard way to do it.
As I’ve discussed before, the standard is completely network-agnostic. If you have a storage network, whether it be InfiniBand, Ethernet, or Fibre Channel, there are ways of transporting NVMe commands across it.
At the moment, the hottest topics are the versions that can be sent over Ethernet, because of the technology’s ubiquity and general flexibility in deployment options. RDMA-based technologies such as RoCE, iWARP, and iSER are based upon InfiniBand-like behaviors running over Ethernet transport, and they dominate the discourse on Fabrics at the moment. I’ll be addressing each of those in later posts.
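For what it’s worth, the transport-agnostic design shows up directly in the Linux nvme-cli tool: the same discover/connect verbs are used regardless of fabric, with only the transport argument and the transport-specific address format changing. The addresses and NQN below are hypothetical placeholders, and exact flags can vary by nvme-cli version – treat this as a sketch, not a recipe.

```shell
# Discover and connect to an NVMe-oF target over an RDMA fabric
# (RoCE/iWARP) -- the IP address and NQN are made-up placeholders.
nvme discover --transport=rdma --traddr=192.168.10.20 --trsvcid=4420
nvme connect  --transport=rdma --traddr=192.168.10.20 --trsvcid=4420 \
              --nqn=nqn.2017-01.com.example:subsystem1

# The same verb over Fibre Channel: only the transport and the address
# format (node/port WWNs, placeholders here) change.
nvme connect  --transport=fc \
              --traddr=nn-0x200000109b123456:pn-0x100000109b123456 \
              --host-traddr=nn-0x200000109b654321:pn-0x100000109b654321 \
              --nqn=nqn.2017-01.com.example:subsystem1
```

These commands obviously require a real fabric (and root) to do anything, but the point is the symmetry: the protocol above the transport is identical.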
Fibre Channel and NVMe-oF (Geek Alert)
Like Ethernet, Fibre Channel has a layered model for its stack. At the upper layer, there is an entity called FCP (the Fibre Channel Protocol – how recursive!).
FCP is used to connect to upper-layer protocols, such as SCSI and FICON. As it’s written now, Fibre Channel does not have an RDMA protocol, so FCP is used instead. What this means is that FCP provides a way of creating a “connection,” or association, between participating ports, so that NVMe devices are treated as, well, NVMe devices (instead of mimicking or “mapping” them).
In other words, just as Fibre Channel carries the SCSI protocol over FCP between end devices, it does the exact same thing for the NVMe protocol between end devices.
The Fibre Channel Advantage
There are three key pieces to the puzzle, however, that go beyond just the protocol and give FC a distinct advantage for users who may be trying to determine which fabric is most appealing at the moment.
1) Fibre Channel is a dedicated storage solution
This may seem trivial and obvious, but it actually makes a big difference.
NVMe is, for all intents and purposes, a “new” storage protocol technology. It works quite well, but there are enough nuances to change the nature of the way systems are architected and designed. RDMA-based protocols are not “natively” designed specifically for storage – on the contrary, they were designed for inter-process communication (IPC) between compute nodes with direct memory access (that’s the DMA part of RDMA – the “R” stands for “Remote”).
So, while RDMA-based solutions are extremely useful and show promise, they simply don’t have the track record for reliability and availability in storage. In fact, one of the big debates going on at the moment is which RDMA-based protocol is the most intuitive to deploy.
Fibre Channel, on the other hand, is a well-understood quantity. There is a reason why 80+% of Flash storage systems use Fibre Channel as the protocol of choice – it handles the demands of that kind of traffic with remarkable adeptness.
In other words, customers who are trying to minimize the moving parts of a new storage paradigm may find that keeping the change down to one variable – NVMe-oF itself – is something of a comfort.
2) Fibre Channel has a robust discovery and Name Service
When you put devices onto a network where they need to communicate with other devices, there has to be a way for them to find one another.
As Erik Smith pointed out (better than I could have done), Fibre Channel is a “network-centric” solution. The network controls what each end-device will be connected to. That is, the nature of the FC fabric is such that it knows every device that is connected. This makes the onerous job of discovering devices and controlling/managing access to them much easier.
This feature cannot be overstated. By comparison, iSCSI’s name service – Internet Storage Name Service (iSNS) – has often been criticized for performance, reliability, and responsiveness in real-world deployments. In fact, in many iSCSI deployments iSNS simply isn’t used – end devices are manually configured to find their corresponding storage. iSNS is a clever system in its own right, but you would be hard-pressed to find someone who would argue that the available implementations are as good as, or as reliable as, the FC Name Service.
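For contrast, this is roughly what that manual configuration looks like with open-iscsi’s iscsiadm tool: instead of asking a name service, the initiator is pointed at a portal address the administrator must already know. The IP address and IQN below are hypothetical placeholders.

```shell
# SendTargets discovery against a portal the admin must already know
iscsiadm -m discovery -t sendtargets -p 192.168.1.10:3260

# Log in to a target returned by that discovery (IQN is a placeholder)
iscsiadm -m node -T iqn.2017-01.com.example:target1 \
         -p 192.168.1.10:3260 --login
```

Perfectly workable, but every initiator has to be told where to look – the network itself doesn’t know.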
Why bring up iSNS? Because the NVMe version of the Name Service is based heavily on the iSNS model. (Note: I am not dismissing or denigrating either iSNS or the NVMe name service. I’m simply saying that the Fibre Channel name server is practically bullet-proof by comparison).
Every Fibre Channel device – from any manufacturer – will use this method of discovery and fabric registration. That kind of guarantee simply cannot be made for other, Ethernet-based systems.
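To make that concrete: on most FC switches you can dump the fabric name-server database directly and see every registered device, including the FC-4 type it registered. The command names vary by vendor – the two common ones are sketched below – and any WWNs they print would of course be specific to your fabric.

```shell
# Brocade Fabric OS: list devices registered with the name server
nsshow

# Cisco MDS NX-OS: show the fabric-wide name-server database
show fcns database
```

No per-host configuration is involved; the fabric built that database itself as each device logged in.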
3) Qualification and Support
I’ve saved the most important reason for last.
Unlike Ethernet-based devices (I’m deliberately leaving out InfiniBand, because there is really only one manufacturer for IB devices, realistically), Fibre Channel solutions are qualified end-to-end, from Operating System to Array.
Manufacturers of FC devices undergo very painful, very expensive qualification and testing procedures to make sure that when an FC solution is implemented, customers will not discover – painfully and on their own – that driver mismatches and hardware incompatibilities have turned it into a very expensive mistake.
It will be a long, long time before such an end-to-end qualification matrix exists for Ethernet-based systems – if it ever does. There are simply too many end-device manufacturers for every configuration possibility to be tested. Sure, plug-fests help (and they do!), but the cost burden is prohibitive for Ethernet solutions – at least to the level and degree that qualification exists for Fibre Channel.
This becomes extremely important as we start to understand new NVMe-oF drivers, hardware, discovery protocols, and RDMA implementations. Add to that the challenge of guaranteeing network QoS and end-to-end integrity, and there are going to be several missteps before things get ironed out industry-wide.
Without question, Fibre Channel has the lead in this area, hands-down.
This is, I hope, the first of several pieces on the different fabric solutions that can apply to NVMe over Fabrics. I hope to be able to get to the others – RoCE, iWARP, iSER, InfiniBand, as well as other non-standard approaches – soon.
My intent here is not to say that one technology (i.e., Fibre Channel) is better than the others, just that there are distinct advantages that – as of this writing – appear to make FC worth a serious look. (RDMA-based protocols have distinct advantages as well, but one thing at a time.)
As always, your mileage will vary. All I’m hoping for is to provide a means for people to think about the new technology in a way they may not have thought of before.
[Update: The FCIA is presenting a webinar on FC-NVMe on February 16. Register for the event, even if you can’t attend live, so that you can watch it at your leisure.]