Last week, I had a very interesting email conversation with Chris Mellor, storage writer for The Register. As a trade press reporter, Chris has been trying to distill some of the technologies of FCoE for his readers and one of his articles prompted me to write to him and offer some corrections and clarification.
At first I thought that Chris’ article might have simply been a matter of laziness or FUD, but I didn’t want to jump to conclusions about his motives – and I’m glad that I didn’t. After reading his very thorough email outlining where he got his information, I can not only fathom how he came to understand things the way he did, but also empathize with his frustration.
In short, it’s not his fault. At all. He’s frustrated, and quite frankly so am I.
Sometimes I feel like Sisyphus, condemned to roll his stone up the hill only to be robbed at the summit by having it roll back to the bottom. I’ve been trying very hard to be as accurate as I can with respect to FCoE, but there are some pretty heavy hitters lined up against me who knock that stone back down to the bottom of the hill.
Chris asked some very pointed (and excellent!) questions about how FCoE works and how the standards fit into the process. He cited quotations from numerous white papers and vendor documentation that should have been 1) technically accurate and 2) marketing neutral. They were neither.
We’re not talking little faux pas here, we’re talking things that are just flat-out wrong.
He had quote after quote from several vendors (not just one) that would have led any reasonable person to think the exact opposite of how FCoE actually works.
Muddying The Waters
A few of the examples that Chris provided are worth sharing to illustrate the point (vendor names have been removed to protect the guilty):
Vendor #1: “The enhanced transmission selection (ETS) algorithm will strengthen the ability of FCoE to reliably use Ethernet as a transport layer and minimize the chance of link congestion and frame loss.”
ETS does nothing of the kind. It has nothing to do with reliability, congestion, or frame loss. ETS has to do with bandwidth allocation and groupings.
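To make the distinction concrete, here is a minimal Python sketch of the kind of bandwidth sharing ETS is concerned with. It is an illustrative model under my own assumptions, not the algorithm from the standard: the group names, the percentages, and the ets_share function are all hypothetical. Each traffic group is guaranteed its configured share of the link, and whatever a group leaves unused is offered to groups that want more. Notice what is absent: nothing here pauses a sender, signals congestion, or prevents a dropped frame.

```python
def ets_share(link_gbps, guaranteed_pct, offered_gbps):
    """Hypothetical model of ETS-style bandwidth sharing (not the standard's text)."""
    # Step 1: each group gets the smaller of its demand and its guaranteed share.
    granted = {}
    for group, pct in guaranteed_pct.items():
        guarantee = link_gbps * pct / 100.0
        granted[group] = min(offered_gbps.get(group, 0.0), guarantee)

    # Step 2: offer leftover bandwidth to groups that still want more,
    # in proportion to their configured percentages.
    spare = link_gbps - sum(granted.values())
    while spare > 1e-9:
        hungry = [g for g in granted if offered_gbps.get(g, 0.0) - granted[g] > 1e-9]
        total_pct = sum(guaranteed_pct[g] for g in hungry)
        if not hungry or total_pct == 0:
            break
        handed_out = 0.0
        for g in hungry:
            offer = spare * guaranteed_pct[g] / total_pct
            extra = min(offer, offered_gbps[g] - granted[g])
            granted[g] += extra
            handed_out += extra
        spare -= handed_out
    return granted


# Example values are made up: a 10 Gb/s link split 50/40/10.
shares = ets_share(
    link_gbps=10.0,
    guaranteed_pct={"lan": 50, "san": 40, "ipc": 10},
    offered_gbps={"lan": 8.0, "san": 2.0, "ipc": 0.5},
)
print(shares)  # lan borrows the 2.5 Gb/s that san and ipc are not using
```

In this run the LAN group ends up with 7.5 Gb/s even though it is only guaranteed 5, because the other groups are idle for part of their share. That is bandwidth management, and it is all ETS is about.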
Vendor #2: “PFC allows Fibre Channel storage traffic encapsulated in FCoE frames to receive lossless service from a link that is being shared with traditional LAN traffic, which is loss-tolerant.”
While technically correct, the sentence implies that FCoE and LAN traffic are thrown willy-nilly onto the link and that one type can interrupt the other. This is misleading.
Vendor #3: “For example, with PFC, if storage traffic has a higher priority than LAN traffic and a large storage transfer causes congestion, PFC can be engaged to pause the storage transfer and let the LAN transfer proceed.”
In this case “priority” is used in the colloquial sense, i.e., there is a hierarchy of prioritization where some traffic is more important than others. In the case of DCB networks, “priority” is a misnomer because it relates to classes of service rather than how important they are. In either case, PFC does not “let LAN transfer proceed”; it focuses on making one type of traffic lossless – that’s all. It has nothing whatsoever to do with acting as a traffic cop that decides which traffic should be transmitted and which should not.
Vendor #1 again: “Based on the priority information collected through PAUSE, the server stops sending any traffic for that specific application while the other applications continue to make progress without disruption on the shared link.”
Neither PFC nor PAUSE (these are two completely different mechanisms, though the terminology is reused) “collects priority information.” This is pure fantasy. As a result, neither PAUSE nor PFC has any role in whether or not “other applications continue to make progress without disruption on the shared link.”
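The difference between the two mechanisms can be sketched in a few lines of Python. This is a toy model with hypothetical names (Link, receive_pause, receive_pfc), not an implementation of either mechanism, and it leaves out the pause timers that real frames carry. PAUSE stops every priority on the link, while PFC stops only the priorities named in the frame. In neither case does the sender collect or learn anything about which traffic is more important; it is simply told which lanes to hold.

```python
class Link:
    """Toy sender-side view of one Ethernet link with eight priorities (0-7)."""

    def __init__(self):
        self.paused = set()  # priorities currently held back

    def receive_pause(self):
        # Link-level PAUSE: the whole link stops, whatever the priority.
        self.paused = set(range(8))

    def receive_pfc(self, priorities):
        # PFC: only the listed priorities stop; the rest are untouched.
        self.paused |= set(priorities)

    def resume(self, priorities=range(8)):
        # Release the listed priorities (all of them by default).
        self.paused -= set(priorities)

    def can_send(self, priority):
        return priority not in self.paused


link = Link()
link.receive_pfc([3])  # 3 is only an example value for the lossless lane
print([p for p in range(8) if link.can_send(p)])  # every priority except 3

link.receive_pause()
print([p for p in range(8) if link.can_send(p)])  # nothing can be sent
```

The model has no notion of one priority outranking another, which is the point: the number is a label for a lane, not a ranking.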
Vendor #4 (ironically, from a company that does not currently offer an FCoE product): “Typically in the data center FCoE traffic will be assigned to the higher priority classes. This ensures that congestion due to other less sensitive traffic between servers will not cause loss of FCoE storage traffic.”
Again, the definition of “priority” causes havoc here. Typically FCoE is assigned to CoS (or “priority”) 3. Now, this means that you will have to be careful if you’ve assigned other types of traffic to CoS/“priority” 3, but it does not mean that FCoE is any more or less important than traffic on, say, priority 2 or 5. These are non-hierarchical lanes.
Again, from Vendor #1 (different white paper): “In addition, 802.1Qbb can leverage prioritization to establish bandwidth allocation on a per-application basis. Time-sensitive applications such as inter-process communications (IPC) can be given a higher percentage of available bandwidth as needed while other applications are assured portions of the remaining available bandwidth.”
802.1Qbb (PFC) says no such thing. Priority Flow Control is, well, Priority Flow Control. Bandwidth allocation is part of 802.1Qaz, Enhanced Transmission Selection. That standard controls transmission selection and enhances it (i.e., bandwidth management). (By the way, the Qaz document also defines how devices can communicate configuration information and establish correct settings, also called DCBX.)
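As a quick reference, the split between the standards can be written down as a small lookup table. The Python below is only a memory aid: the dictionary and the standard_for helper are hypothetical, and the entries simply restate the division of labor described above. The 802.1Qau entry for Congestion Notification is my addition for completeness.

```python
# Which DCB document covers which job (labels on the right are informal).
DCB_STANDARDS = {
    "802.1Qbb": {
        "name": "Priority Flow Control (PFC)",
        "does": {"per-priority pause"},
    },
    "802.1Qaz": {
        "name": "Enhanced Transmission Selection (ETS)",
        "does": {"bandwidth allocation", "DCBX configuration exchange"},
    },
    "802.1Qau": {
        "name": "Congestion Notification (QCN)",
        "does": {"congestion notification"},
    },
}


def standard_for(function):
    """Return the standards in the table that cover the given function."""
    return [std for std, info in DCB_STANDARDS.items() if function in info["does"]]


print(standard_for("bandwidth allocation"))  # ['802.1Qaz']
print(standard_for("per-priority pause"))    # ['802.1Qbb']
```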
This is not an exhaustive list of what he shared with me, or that I have found for myself. Let’s not even get into the confusion surrounding QCN (Congestion Notification) and TRILL at this point. It’s sufficient to point out that all the errors mentioned above are compounded.
Is it any wonder this is frustrating? Remember, these come from technical white papers, not marketing material or press releases!
So Why Bother?
Given the nature of my role at Cisco (if this is the first time reading my blog, in the interest of transparency it should be noted I am a Product Manager for FCoE for the Data Center Business Unit), it might be easy to assume that this is a case of “the PM protests too much!” After all, it’s in my best interest (and my company’s) to promote FCoE at all costs, no matter what, damn the torpedoes, full speed ahead!
Actually, not-so-much. It’s in my best interest to make customers and partners understand the role storage-over-Ethernet (whether it be FCoE or not) plays in their Data Center, even if that role is “not at all.”
I have tried to be consistent in my approach: I have never said (nor would I state) that FCoE is right for all customers, on all occasions, in all situations.
I do think that for certain situations and for certain customers FCoE is a very cool technology that can allow for some very interesting (and cost effective!) solutions to long-term issues within the Data Center.
But how are those customers supposed to know if they are the ones who would benefit the most? How are they supposed to make decisions if all they get is crap like this? Poor Chris, whose job it is to break it down into plain English, is stuck because he relies on the very same documents that are completely unreliable!
I mean, really, can he (or anyone else) be blamed when public statements made by these vendors about aspects of FCoE cannot be trusted?
I don’t think so.
The practical upshot is that there is a gap between how things really work and what customers (and partners) are learning about the technology. For my part, while I may not have any control over what other vendors are writing about, I do have some influence over what my company presents (limited though it may be), and I have 100% control over the accuracy and accountability of my own tweets, blogs, and presentations.
As a result, I strive very hard to be as accurate as I can be. Things change, and sometimes I’m flat-out incorrect. But at least I can make the promise to remain accountable to what I write.
Obviously I can’t stop the misinformation single-handedly, but it doesn’t mean that I can’t get that rock up to the summit. If anyone wants to help, come on board.
If it seems like I’m being pedantic here, complaining about the incorrect identification of PFC when they really mean ETS (or vice versa), look at it this way:
The building blocks that make up FCoE can be arranged to create some very complex and creative solutions. If you don’t know what those building blocks are, how can you expect to use them properly?
Here’s a recent product announcement for a Dell PowerConnect 8024F Switch. If you look at the graphics on the page it implies that it can do a lot of different things. It does, for instance, claim to be “Unified Fabric Ready” right next to a very beautiful picture of LAN/SAN/IPC traveling within the same wire.
So far so good, right?
With all the talk about FCoE, Priority Flow Control, LAN/SAN convergence, can you determine:
- Whether this switch can run FCoE? After all, it mentions Unified Fabric and 802.1Qbb/PFC
- Whether it can take part in device discovery and configuration with other DCB switches?
- Where it might fit into an FCoE fabric?
- Bonus: Whether this switch can handle InfiniBand traffic (implied by IPC)?
As you can see, those little building blocks suddenly make a huge difference.
You can follow me on Twitter: @jmichelmetz
Official Disclaimer: Some of the individuals posting to this site, including the moderators, work for Cisco Systems, Inc. Opinions expressed here and in any corresponding comments are the personal opinions of the original authors, not those of Cisco.