Re-Examining FCoE and iSCSI Pros and Cons

It should be obvious that I’ve been doing a lot of thinking about FCoE over the past 2 years, despite only blogging about it for a couple of months (hey, I’ve been busy!). But someone brought up something recently that makes me second-guess the role that FCoCEE will play in the marketplace in comparison to, say, iSCSI. In particular, with vendors now offering 1 Million IOPS in their NICs, is there enough of an advantage performance-wise?

Now, this is more of an open-ended, brainstorming thinking-out-loud kind of blog post, as opposed to a declarative “this is my carefully considered analysis” kind of post. I mention this up front because this is usually when I make boneheaded statements about things about which I actually know better, but somehow forget when thinking through the process.

I’m not the first to do it on this subject. Scott Lowe did it with a “thinking out loud” post, to which Chad Sakac wrote a fantastic response.

I say this specifically because I have no intention of throwing any person, technology, company, or strategy under any fast-moving public transportation vehicles. Having said that, my thoughts here examine a couple of different aspects of FCoE adoption as it relates to the overall business case.

Here’s what FCoE has going for it:

1) It’s Fibre Channel (FC).

Despite running over 10GbE, the protocol is pure FC, which is well-understood. The tools for management, provisioning, administration, etc. are familiar to enterprise data centers and do not require a wholesale shift over to alternative SAN protocols.

2) It’s cool.

Large enterprise data centers with oodles of servers, networks, and storage have a wee problem with power and cooling. As I’ve mentioned before, it wasn’t uncommon for customers to have up to 16 NICs/HBAs in each server.

For grins and giggles, I decided to take a quick cost comparison of Cat 6a vs. TwinAx® power consumption (TwinAx® is the copper cabling appropriate for TOR switching used for FCoE).

Even without looking at the costs on the FC side of things, comparing the Ethernet cabling alone was striking. Cat 6a draws 16 W of power; TwinAx® draws 0.1 W (yes, that’s point-one watts).

The cost of running one Cat 6a cable for a year was $14.02 (assuming a very generous price tag of $0.10 per kWh).

The cost of running one TwinAx® cable for a year was $.09.
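
If you want to check my per-cable math, here is a minimal sketch in Python. The function name is mine and purely illustrative, and it assumes the cable draws its rated wattage around the clock, all year. It only reproduces the per-cable figures; data-center-wide totals also depend on how many cables you count per server and what you assume for cooling, which this sketch does not model.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_power_cost(watts, price_per_kwh=0.10):
    """Yearly electricity cost in dollars for a constant draw of `watts`."""
    kwh_per_year = watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * price_per_kwh


cat6a_cost = annual_power_cost(16)     # Cat 6a at 16 W
twinax_cost = annual_power_cost(0.1)   # TwinAx at 0.1 W

print(f"Cat 6a: ${cat6a_cost:.2f} per cable per year")
print(f"TwinAx: ${twinax_cost:.2f} per cable per year")
```

Change `price_per_kwh` to your own utility rate and the gap scales right along with it.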

Skip to the punchline: One data center with 1000 servers, cost of power and cooling:

  • Cat 6a: $70,100/yr
  • TwinAx®: $225/yr

This is just for cabling. That’s not even including the power reduction from not needing to have both NICs and HBAs (let alone multiples of each) in every server.

3) It’s got a distinct performance advantage.

Despite the hype about IOPS (which are important, don’t get me wrong!), they’re not the whole story. Latency becomes a big deal, especially the farther up the TCP/IP stack you go. As Chad points out, with anything that is dependent upon TCP/IP (like iSCSI) you’ll have to deal with ARP traffic, which can delay transmission by seconds.

For those applications with strict SLAs, that’s simply unacceptable. FCoE keeps the link negotiation to a minimum, expanding the Ethernet frame size to 2.5 k (which means that you should look at the results at the next highest IOPS measurement, 4k, when comparing vendor capabilities for FCoE).

With the link rate increasing from 10Gb to 40 to 100, it seems to me that you’re asking the TCP stack to multiply the number of overhead transactions per second accordingly. The flow control mechanisms, particularly variable flow control and windowing, add even more overhead that FCoE simply doesn’t have to deal with. If I’m correct about this (and there’s no guarantee I am; after all, I’m only a lowly CCNA, not a CCIE or iSCSI guru), it seems that the faster you go with iSCSI, the bigger a performance hit you will take. Am I wrong about this?
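
To put rough numbers on that hunch, here is a back-of-the-envelope sketch in Python. The 1500-byte frame size for iSCSI is my assumption (a standard Ethernet MTU; jumbo frames would change the picture), the 2500-byte figure is just the 2.5k FCoE frame rounded, and the function name is made up for illustration. It ignores preamble, inter-frame gaps, and header overhead, so treat it as a ceiling on how many frames per second a saturated link carries, not a measurement of anything.

```python
def frames_per_second(link_gbps, frame_bytes):
    """Upper bound on frames per second needed to fill a link of `link_gbps`."""
    bytes_per_second = link_gbps * 1e9 / 8
    return bytes_per_second / frame_bytes


LINK_RATES_GBPS = [10, 40, 100]
FRAME_SIZES = {
    "1500-byte frames (assumed iSCSI MTU)": 1500,
    "2500-byte frames (approx. FCoE)": 2500,
}

for label, size in FRAME_SIZES.items():
    for rate in LINK_RATES_GBPS:
        fps = frames_per_second(rate, size)
        print(f"{label} at {rate} GbE: {fps:,.0f} frames/s")
```

Whatever per-frame work the stack does, it has to do it that many times a second, which is why the per-frame overhead matters more as the pipe gets fatter.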

4) It’s evolutionary, not revolutionary.

There’s no “R n’ R” when it’s Rip and Replace. FCoE allows data center managers to approach the implementation from the perspective of “adding on.” They don’t have to create an additional SAN silo that can’t be connected to existing infrastructure, nor do they have to replace thousands of NICs and HBAs with CNAs.

What this means is that FCoE becomes a more longitudinal, strategic approach to the data center. In many ways, this is different than the situational approach to determining connectivity that has been in practice for some time. Planning for long-range integration of legacy (and future – 40, 100GbE) systems becomes key, but ultimately rewarding in providing a unified business plan, not just a unified technological fabric.

5) It’s “wire-once.”

As Chad points out in his blog post, CEE (and by extension FCoE) means that once you wire for FCoE it’s automatically applicable to a broader market (LAN/NAS/iSCSI). Given #4, this means that FCoE infrastructure opens up the data center to broader use of Ethernet storage.

Because of these reasons – and several others – FCoE has been touted as “the Next Generation Data Center.”

However, the prediction of a victorious FCoE, triumphant over a hapless iSCSI, seems a bit premature. For one thing, as noted above, there have been several recent announcements of iSCSI/Hyper-V performance records that are, indeed, impressive.

Here’s what iSCSI has going for it:

1) It’s cheap.

Let’s face it. iSCSI is a fantastic technology for the non-enterprise space. The overall sandbox for where iSCSI plays is much, much larger than enterprise data centers that will need the type and kind of connectivity that FCoE is built for.

As David Dale pointed out over a year ago, iSCSI’s “sweet spot” is “in support of windows or Virtual Server environments: in tier 2 and tier 3 data centers in large organizations; in the core data center of small/medium enterprises; and in remote offices. In other words, iSCSI is usually chosen for environments that FC has had difficulty penetrating due to cost, complexity, functionality, and support issues.”

Indeed, FCoE is geared towards maintaining backwards compatibility with existing FC infrastructures, and is “unlikely to replace or displace continued iSCSI growth in these sweet spot environments.”

Nevertheless, 10GbE is not free. It involves purchasing 10GbE switches, NICs, and other CapEx items, which means that the same type of budgetary constraints will be placed on iSCSI, even if the initiator portion of it is “free” in software.

2) It’s in a growth phase.

Because iSCSI was closely tied to Windows growth in Tier 2 and Tier 3 data centers (as opposed to FC in Unix environments in the Tier 1 and Tier 2 DCs), it’s easy to see that as the Windows deployment numbers increase, so will iSCSI.

According to IDC, iSCSI has consistently grown faster than the overall networked storage market:

Through the third quarter of this year iSCSI was expected to account for 13% of revenues in the networked storage market, with Fibre Channel accounting for 61% and NAS for 26%. In terms of capacity, iSCSI accounts for 15% of the networked storage market, with Fibre Channel SANs at a 52% market share and NAS with the remaining 33% of the market.

In terms of revenue growth, IDC estimates that the iSCSI market will grow 58% this year, vs. 17% for the overall networked storage market. And in terms of capacity growth, the iSCSI market is expected to surge 117% this year, vs. 90% for the total networked storage market (David Dale, Dec. 21, 2009).

3) It’s popular.

There are still a lot of Open Source people who are fond of the fact that iSCSI permits them to avoid vendor lock-in. It’s almost impossible to look at any FCoE vs. iSCSI conversation (though it’s a false dichotomy) and not find anti-Cisco, anti-Brocade, or any other anti-FC vendor sentiment.

Even leaving aside the emotional aspects, iSCSI has allowed IT organizations to do more with less, utilizing software initiators when necessary and hardware initiators to capitalize on virtualization needs. As CPU speeds, hardware speeds, link speeds all increase, iSCSI is a flexible approach that can be easily configured by nearly anyone with TCP/IP experience.

For that reason iSCSI runs on a much, much wider base of equipment, and for customers who do not already have an existing shared storage infrastructure it is a no-brainer for adoption. In fact, I can’t think of any FCoE vendor who is not also a major iSCSI supplier (though I could be wrong – it’s happened once or twice before).

4) It can be wicked fast.

Early in 2010 Microsoft and Intel hit some very impressive numbers with iSCSI performance and SSDs in a highly tuned setup. Even at 4k block sizes, they pulled in over 600k IOPS, which works out to roughly 2.4 GB/s of storage bandwidth. Regardless of how tuned that is, it’s still impressive enough to force FC admins to sit up and take notice.
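
IOPS figures convert to bandwidth by multiplying by block size. Here is a minimal sketch, assuming “4k” means 4,096 bytes and taking 600k IOPS as a floor (the function name is mine, for illustration only). That floor alone is roughly 2.4 GB/s, and the number scales linearly with any IOPS above it.

```python
def bandwidth_gb_per_s(iops, block_bytes):
    """Storage bandwidth in GB/s (decimal gigabytes) for a given IOPS rate."""
    return iops * block_bytes / 1e9


floor_bandwidth = bandwidth_gb_per_s(600_000, 4096)
print(f"600k IOPS at 4k blocks: {floor_bandwidth:.2f} GB/s")
```

The same function is handy for comparing vendor claims quoted at different block sizes, since a headline IOPS number at 512 bytes moves far less data than the same number at 4k.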

5) It’s routable.

For some reason I find this to be a common thread among anti-FCoE commenters. iSCSI, by virtue of running over TCP/IP, is routable over a WAN, and apparently that’s a crucial “pro” for iSCSI when comparing it to FCoE.

Personally, it seems to me that “routable” != multi-hop. While ratification of FCoE’s multi-hop standards has yet to occur (as of this writing), this doesn’t appear to me to be a strict apples-to-apples comparison.

6) It’s evolving (too!)

It’s important to remember that iSCSI can run at 10Gb speeds, just like FCoE. In fact, it can run on the same CNAs and enterprise switches (not just the CEE switches that Brocade and Cisco have released in the past year). It’s got mature software initiators for both Microsoft and Linux OSes, and there are other advanced features of IP that have been emerging in the storage world, only a few of which have actually been put into widespread practice.

This entry is long enough without belaboring further, but given these assumptions outlined above, here are the questions worth exploring:

  1. Is iSCSI fast enough at 10GbE, 40GbE, and 100GbE, even with overhead concerns, to rival the performance and efficiency of FCoE?
  2. Does iSCSI provide the same type of traffic flexibility that FCoE can, for the same level of service?
  3. Does iSCSI provide the same type of legacy support and future-proofing that FCoE does?
  4. Does iSCSI’s market maturity create a barrier to entry for FCoE, and if so in which markets?
  5. Does iSCSI provide expansion capabilities similar to FCoE when 40- and 100GbE emerge into the market?

As I mentioned before, my initial interpretation is that FCoE “vs.” iSCSI is a false dichotomy and, quite frankly, asking the wrong questions. CEE is an overarching umbrella that can, conceivably, encompass both protocols in the right circumstances.

Several of these questions have been addressed elsewhere on the Intarwebs but I reiterate them here because I want to re-examine my own initial assumptions in light of where iSCSI is in terms of adoption and where FCoE’s adoption curve can hiccup:

  • If FCoE has a slowdown in adoption just when iSCSI surges forward, could that kill FCoE’s momentum altogether?
  • If FCoE’s momentum were to slow, what would the likely reasons be?
  • If you were in charge of a DC, with or without existing FC infrastructure, is FCoE already a foregone conclusion?
  • If the FC vendors are continuing to push forward with 16Gb and 32Gb FC, as QLogic has already admitted, where exactly does FCoE fit in?
  • How should these questions affect DC customers’ plans for upgrading their infrastructure?

No matter what the answers to these questions are, they should at the very least be asked and understood. So, that’s my (rather lengthy) brainstorming “thinking-out-loud” for the day.

Update: Stu Miniman reminds me of the issues surrounding scalability. iSCSI is easy and manageable for small implementations, but nothing like what you can do for FC at a massive scale (that’s a quote from his tweet, btw). So the question therefore becomes, where is the crossover range for administration capabilities/ease-of-use for FCoE and iSCSI?

You can subscribe to this blog to get notifications of future articles in the column on the right. You can also follow me on Twitter: @jmichelmetz